Tuesday, September 6, 2011

WebCampus

You may e-mail the 1st lab assignment as WebCT has not been enabled yet.

Monday, December 13, 2010

Lecture #27: Network Dynamics


Lecture is about network dynamics, briefly, analysis of the changes that occur
in the network in time. These changes include appearance and disappearance of nodes,
links and also some aggregate metrics of the network. In addition to the analysis
of the known changes, network dynamics deals with predictions about future changes
in the network, like formation and dropping of new edges.

First study mentioned in the class was "Empirical Analysis of an Evolving Social Network".
Paper analyzes a university e-mail network. Ties are weighted with the number of messages
within the last 60 days between the pairs. One major result is that, individuals sharing at
least one class are 3-times more likely to start emailing each other if they have an email contact
in common. Probability significantly increases as number of ties or shared classes increase.
Next part was about the Team Assembly Mechanisms and Evolution of Affiliation Network
Related to Social Network. Examples studies in this field are Broadway Musical Industry and
Collaboration Networks in different sciences. Interesting findings can be summarized as follows;
If your friends join, so will you.
Cliquish communities grow more slowly.

Second part of the lecture was about the results of the affects described in the first.
As the nodes and links change over time, how does the degree distribution,
clustering coefficient and average shortest path change. Graph Densification is analyzed
for physics citations, patent citations, autonomous systems and affiliation networks.
A mathematical model is given. The main result is that number of edges are growing faster
than nodes, so average degree is increasing. Similar studies are carried out for the
diameter change.

In the final part, possible explanation of densification, community structure and difficulty of cross linking, and forest fire model is discussed.

Tuesday, December 7, 2010

Lecture #26: Percolation and Network Resilience

This lecture’s topic was percolation and network resilience. We want to answer the questions of if a given fraction of nodes or edges are removed from a network , how large the connected components are or what the average distance between nodes in the components is. While removing nodes or edges we should consider some points. In bond percolation each edge is removed with probability (1-p). Causing the most damage to the network with the removal of the fewest edges is targeted attack. Usually edges with high betweenness are the targets to make the most damage to network.
Percolation threshold in an Erdos-Renyi graph is the point at which the giant component emerges.
Another fact about the resilience is scale-free networks are resilient with respect to random attacks. However they are not resilient against targeted attacks. Removal of successfully chosen a small number of nodes may cause many more nodes (their neighbors) to be disconnected to the network.
In power grid structural networks each node has a load and a capacity that says how much load it can tolerate. When a node is removed from the network its load is distributed to the remaining nodes. If the load of a node exceeds its capacity then the node fails. Hence we can conclude that network resilience depends on topology of the network.

Colloquium: Epidemic Spreading in Social Networks by Dr Kitsak/December 2nd 2010

The colloquium topic was information diffusion in networks. Dr. Kitsak mentioned about diseases that spread very fast and killed lots of people in the past and recent years also. The most important point here is to understand how information (or disease) spreads based on network structure and what strategies can be applied to overcome the diseases very fast.There are many things that can influence the spread of information within a network. The three main factors in diffusion are network structure (degree distribution, clustering, connected components, etc.), strength of ties (frequency of communication and strength of influence), and spreading agents. A strong tie can be defined by frequent contact, affinity, or having many mutual contacts.

Dr. Kitsak also mentioned three key models for infection. These are SI, SIS and SIR. SIR stands for Susceptible -> Infected -> Removed/Recovered. Each node in the network can be classified by one of these terms in the SIR model. If a node is susceptible, then it can be infected. An infected node is self-explanatory. A removed node means it was infected and removed from the network. Or, a node can recover and build a natural immunity. The SI model is simply Susceptible -> Infected. The definitions remain the same, but a node is unable to recover and therefore remains infected. The SIS model is Susceptible ->Infected -> Susceptible. One of the important network structure that disease can be spread is worldwide airport network. Dr. Kitsak also mentioned about network immunization. One vaccine strategy may be instead of vaccinating all people vaccinate a threshold part so the community can be divided into clusters and virus can't jump to clusters. Dr. Kitsak also told that our community is scale free network.

So who are the most influential spreader in the network?K-shell is the most robust spreading efficiency indicator. So if we want a disease spread quickly we should find different K-shells at different places in the world.As a conclusion Dr. Kitsak pointed out these factors:

1-Almost no epidemic threshold in Scale-Free Networks.
2- Efficient Immunization Strategy: Immunize at least critical fraction fc of nodes so that only isolated clusters of susceptible individuals remain.
3- Immunization strategy is not reciprocal to spreading strategy.
4-Influential spreaders (not necessarily hubs) occupy the innermost k-cores.

Monday, December 6, 2010

Metric computations tools

I have uploaded two tools to computer basic network metric under Course Content. These are useful if you have very large networks that Pajek can not process.

One of the tools is just for betweenness centrality and the other includes a set of metrics and also draws results of some.

Monday, November 29, 2010

Lecture #24: Search in Structured Networks

The lecture talks about searching through structured networks.  It was found through some experimentation that if the network followed a power law it was more searchable. If the network followed a Poisson graph then the network was not as searchable because in this case all the nodes have almost the same degree and all the links are distributed randomly. The most effective way to search a Gnutella network is to search through the highest degree neighbor of the nodes.

The next topic in the lecture is about how people find shortest paths. The strategy to accomplish this is usually a simple greedy algorithm wherein each participants picks a correspondent based on how close they are to the target. In a research carried out by a few researchers to test the accuracy of small world it was found that participants are not very good in routing messages by using the shortest path method as they use local information only. [slide 14-17]

The next topic is about testing a few search strategies on social networks.  In order to perform these tests the email correspondence over labs at HP was monitored over 3.5 months and a network was constructed with nodes such that edges were constructed between individuals who sent emails to each other. It was found that the degree distribution of email senders followed power law. If we considered the filtered network wherein only the participants who sent 6 messages each way were considered, degree distribution followed Poisson distribution and it took 40 steps on average to reach the desired target. In the next strategy, the geographical location of the offices of the participants was considered. It was found that 87% of the 4000 email links were between participants of the same floor. When the organizational hierarchy was considered it was found that hierarchy search was faster than geographical search.

Some research was conducted on virtual community called LiveJournal.  When the degree distribution was observed it followed a log normal distribution rather than power law. The result of a simple greedy algorithm was observed and it was found that 80% of the chains were completed with a slight modification to the basic greedy algorithm search. When the geographical basis of friendships was considered the average user has approximately 2.5 non geographic friends and approximately 5.5 friends based on 1/distance relationship. It was also found that the probability of a person knowing another person doesn’t depend on the absolute distance between the two people but on the relative distance. It is considered that a social network is searchable if a certain fraction r of the messages reaches the target.

Hence, it can be concluded a search can be performed more effectively in a network if the weak ties were also considered in the process and if sophisticated strategies are used.

Wednesday, November 24, 2010

Lecture #25: Information Diffusion

Today's topic was information diffusion in networks. There are many things that can influence the spread of information within a network. The three main factors in diffusion are network structure (degree distribution, clustering, connected components, etc.), strength of ties (frequency of communication and strength of influence), and spreading agents. A strong tie can be defined by frequent contact, affinity, or having many mutual contacts. One important study we looked at was about the strength of weak ties. Despite the name, weak ties can be very useful in spreading information. In the study, the author found that over 55 percent of people in the study found a job through a weak tie. They defined this weak tie as having contact with the person less than two times a week, but more than once a year.

In the lecture, we also discussed two key models for infection. These are SIR and SI. SIR stands for Susceptible -> Infected -> Removed/Recovered. Each node in the network can be classified by one of these terms in the SIR model. If a node is susceptible, then it can be infected. An infected node is self-explanatory. A removed node means it was infected and removed from the network. Or, a node can recover and build a natural immunity. The SI model is simply Susceptible -> Infected. The definitions remain the same, but a node is unable to recover and therefore remains infected.