Friday, October 29, 2010
How to give a talk
http://cel.archives-ouvertes.fr/cel-00529505/en/
http://www.cs.utexas.edu/users/dahlin/professional/badTalk.pdf
http://www.crhc.illinois.edu/wireless/talks/howto.ppt
http://www.cs.ucr.edu/~michalis/TECHWRITING/presentation-20.html
Lecture #17: Presentation by Anusha & Random Networks
The professor talked about Random Networks.
G(n,m) is a random graph with n nodes and m edges. How do we generate one?
Randomly create nodes
Randomly select 2 of those nodes and connect with an edge. Repeat this step.
This may give a directed or an undirected graph, and the result is purely random. Another way is to create nodes one at a time and, for each node after the first, connect it to a randomly chosen existing node. This ends up with n-1 edges and a directed graph, so it is not purely random.
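As a concrete illustration, here is a minimal Python sketch of the first generation procedure described above (repeatedly picking two random nodes and joining them). The function name and parameters are my own; this variant retries duplicate pairs and skips self loops, which may differ slightly from the in-class version.

```python
import random

def gnm_random_graph(n, m, seed=None):
    """Generate a G(n, m) random graph: n nodes, m edges chosen
    uniformly at random (no self loops, no double edges here)."""
    rng = random.Random(seed)
    nodes = list(range(n))            # step 1: create the n nodes
    edges = set()
    while len(edges) < m:             # step 2: add random edges until there are m
        u, v = rng.sample(nodes, 2)   # two distinct nodes, so no self loops
        edges.add((min(u, v), max(u, v)))  # storing in a set avoids double edges
    return nodes, sorted(edges)

# Example: a purely random graph with 10 nodes and 15 edges
nodes, edges = gnm_random_graph(10, 15, seed=42)
print(edges)
```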
P(G) = p^m (1-p)^(N-m), where N = n(n-1)/2 is the number of possible edges (n choose 2).
m = N*p is the expected number of edges.
p = m/N = 2m/(n(n-1)) is the probability of having an edge. This includes self loops, but not double edges.
C = p = <k>/(n-1), i.e., the clustering coefficient of a random graph equals the edge probability.
Random graphs therefore lack the high clustering and the heavy-tailed (hub-dominated) degree distributions seen in real networks.
Key Points: Generation Model
Wednesday, October 27, 2010
Lecture 18: Small World Networks
Lecture #14 Presentations by Chris Zachor and Daniel Ramos
The title of Daniel Ramos's lecture was The Rise of Smart Devices on Mobility in Wireless Networks. The lecture started with definitions of wired and wireless networks and the main characteristics of each: the structure of a wired network is fixed, while the structure of a wireless network is dynamic and changes often. Wireless networks have lower bandwidth and quality of service than wired networks and are easily affected by the environment. This was followed by an introduction to mobile smart devices. Mobility studies are important because of quality of service, bandwidth utilization, and roaming problems. Previous studies have focused on simulated mobility models and real data, including work on modeling roaming in large-scale wireless networks using real measurements and on the access and mobility of wireless Personal Digital Assistant (PDA) users. An evolutionary topology model and a campus waypoint model were used in those studies. The conclusion of the lecture was that studying mobility is important to the evolution of network protocols, that highly mobile devices are already changing both computer and cellular network usage, and that more actual data and realistic models are needed.
Tuesday, October 19, 2010
Lecture 15 - Presentations by Engin and Esra
Next, Esra presented her project on gene expression networks. She explained the term stress, a factor that affects plant growth in the natural environment. Every organism responds differently to the stresses it is subjected to, and plants have to adapt to changes in the environment and to other stress conditions. In general, plants are affected adversely by these conditions, and as a result human life will also be affected by the stresses that plants encounter. The main goal of this project is to analyze the stress tolerance of the crops rice, wheat, and maize under drought, high salinity, and low temperature, and to study the sets of genes that respond to each of these stresses separately. For every stress condition there will be an undirected graph whose vertices correspond to genes, and two genes are connected by an edge if their expressions are correlated. Once this network is constructed she will apply the metrics of complex network theory to analyze it.
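As a rough illustration of how such a network might be built, here is a minimal Python sketch that turns an expression matrix into a co-expression graph. The data layout, gene names, and the 0.8 correlation threshold are my own assumptions, not details from Esra's talk.

```python
import numpy as np

def coexpression_edges(expr, gene_names, threshold=0.8):
    """Build the edge list of an undirected co-expression network.

    expr: 2D array with one row per gene and one column per sample
          (e.g. measurements under a single stress condition).
    Two genes are connected if the absolute Pearson correlation of
    their expression profiles is at least the (assumed) threshold.
    """
    corr = np.corrcoef(expr)                  # gene-by-gene correlation matrix
    edges = []
    for i in range(len(gene_names)):
        for j in range(i + 1, len(gene_names)):
            if abs(corr[i, j]) >= threshold:
                edges.append((gene_names[i], gene_names[j]))
    return edges

# Toy example with random expression values for four hypothetical genes
expr = np.random.rand(4, 20)
print(coexpression_edges(expr, ["geneA", "geneB", "geneC", "geneD"]))
```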
Monday, October 18, 2010
Lecture 16 - Presentations by Austin and Guoxun
Next, Guoxun presented his project on aerosol networks. Aerosols are an important subject of study because many pollutants are aerosols and they can have an effect on climate change, health, and the economy. Aerosols are a global problem because they can travel on trade winds. Guoxun will focus on California and how ozone is transported across the state. Previous work consists of a study of Clark County air quality in March 2006 by Robert A. Baxter of T&B Systems, which found that the concentration of ozone increased during commuter hours and was spread widely across the state by winds.
Thursday, October 14, 2010
Lecture #13: Presentations by Akgun and Sevincer
Abdullah Sevincer's project is about funding networks. In a funding network, the nodes are government agencies on the funding side and authors and their universities on the funded side. Funding networks have been studied before, but such works are not sufficiently detailed. Sevincer will use NSF as the basic funding institution because it publishes information about its awards. He will use NSF data and try to find relations between authors and funding, universities and funding, and so on.
Akgun, on the other hand, will work on a subnet-level Internet topology generator. He mentioned that a subnet-level topology generator has not been attempted yet. First, he explained the need for Internet topology generators: network researchers need realistic Internet topologies to run performance studies of their newly designed protocols, and mapping real Internet topologies is difficult and time consuming, so synthetic but realistic topology generators have a lot of use in this area. Second, he presented a detailed related-work section covering previous implementations of router-level and AS-level Internet topology generators. Finally, he talked about his design objectives: a realistic and highly configurable subnet-level topology generator will be designed.
Wednesday, October 6, 2010
Lecture #12
The lecture started with Jeff giving his related-work presentation on predicting eventual friendships. Using cultural algorithms, Katz centrality, and social theory, he wants to predict who you will become friends with. He finds that one of the key problems is that current methods only account for a 1- to 2-hop distance in the network; by this account, you could only become friends with a friend of a friend. This, of course, is just a brief overview of some of what Jeff covered in class. For more information, check out his slides on the main CS 790g website.
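Since Katz centrality came up, here is a minimal sketch of how it can be computed for a small network; the toy adjacency matrix and the values of alpha and beta are illustrative assumptions, not anything from Jeff's project.

```python
import numpy as np

def katz_centrality(A, alpha=0.1, beta=1.0):
    """Katz centrality: x = beta * (I - alpha*A)^(-1) * 1.
    alpha must be below 1 / (largest eigenvalue of A) so the
    underlying series of walk counts converges."""
    n = A.shape[0]
    return beta * np.linalg.solve(np.eye(n) - alpha * A, np.ones(n))

# Toy undirected network 0-1-2-3 (a path), as a symmetric adjacency matrix
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
print(katz_centrality(A))   # the two middle nodes score higher than the end nodes
```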
After that, Dr. Gunes gave a lecture on graph data mining. As mentioned in class, graph pattern mining can be used for a number of things, including mining biochemical structures, program control flow analysis, and mining XML structures or web communities. A few of the graph mining algorithms mentioned in the lecture are incomplete beam search, inductive logic programming, and graph-theory-based approaches (including the Apriori-based approach and the pattern-growth approach). We covered the two graph-theory-based approaches in class.
Two important definitions to remember from this lecture were frequent and support. The support of a graph g is the fraction (or count) of graphs in a dataset that contain g as a subgraph. A graph (or subgraph) is considered frequent if its support in a given dataset is no less than a minimum support threshold.
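As a small illustration of the support definition, here is a sketch that counts how many graphs in a toy dataset contain a candidate pattern, using networkx subgraph isomorphism; the dataset, pattern, and the 0.3 threshold are made up for the example.

```python
import networkx as nx
from networkx.algorithms.isomorphism import GraphMatcher

def support(pattern, dataset):
    """Support of a pattern: the fraction of graphs in the dataset
    that contain the pattern as a subgraph."""
    hits = sum(1 for g in dataset
               if GraphMatcher(g, pattern).subgraph_is_isomorphic())
    return hits / len(dataset)

# Toy dataset: a triangle, a 4-cycle, and a path; the pattern is a triangle
dataset = [nx.cycle_graph(3), nx.cycle_graph(4), nx.path_graph(4)]
pattern = nx.cycle_graph(3)

min_support = 0.3                 # assumed minimum support threshold
s = support(pattern, dataset)
print(s, "-> frequent" if s >= min_support else "-> not frequent")
```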
Tuesday, October 5, 2010
Lecture 11: Large-scale structure of networks
Another topic in this lecture was shortest paths and the small world effect. The shortest path length is a commonly computed measurement of networks. When considering shortest paths, we must also discuss the total number of nodes in a network and the small world effect. Essentially, as the number of nodes in a random network grows, the average path length l between nodes does not change dramatically: most random networks have an average path length that grows on a logarithmic scale, l ~ log(n). It is because l changes very little even with a dramatic increase in the number of nodes that we speak of the small world effect. For example, in the current Internet, nodes can typically reach other nodes within 10 to 12 hops. Similarly, any human on Earth should be able to reach any other human within 6 acquaintances. The small world effect essentially says we are all more closely connected than we think. Further, if a network is a power-law network, we get l ~ log(log(n)).
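A quick way to see the l ~ log(n) behaviour is to grow Erdős–Rényi random graphs while keeping the mean degree fixed and watch the average path length creep up roughly in step with log(n) rather than with n itself; the sizes and mean degree below are arbitrary choices for the sketch.

```python
import math
import networkx as nx

mean_degree = 6                       # assumed, held fixed as n grows
for n in [100, 500, 2500]:
    G = nx.gnp_random_graph(n, mean_degree / n, seed=1)
    # restrict to the largest connected component so path lengths are defined
    giant = G.subgraph(max(nx.connected_components(G), key=len))
    l = nx.average_shortest_path_length(giant)
    print(n, "avg path length =", round(l, 2), " log(n) =", round(math.log(n), 2))
```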
Lastly, our lecture concluded with a discussion of the hop distribution, i.e., how many nodes can be reached in n hops. When considering distributions, most centrality measures look similar to a power law, except betweenness centrality (at the AS level). If we examine the graph, we notice a straight, vertical line at the beginning of the plot, which indicates all the nodes (ASes) with that betweenness centrality. These nodes are stub networks, such as universities, that are connected to the network but do not necessarily have anyone passing through them and thus do not serve as a shortest path for anyone; they are on the edge of the network. Further, we observe that the Internet lacks redundancy, as there are not many triangles, and the clustering coefficient is actually less than its expected value.
Lecture 10: Power laws
ln(p(x)) = c - a*ln(x)
where p(x) is the probability of observing an item of size x, and c and a are two constants. Exponentiating gives p(x) = C*x^(-a) with C = e^c, the familiar power-law form.
Power-law distributions are everywhere: word frequencies, web hits, books sold, telephone calls received, and earthquake magnitudes all follow power-law distributions. When analyzing a power-law distribution, the most common but not very accurate method is to do a linear fit over the whole range on a log-log scale to get the values of a and c. Sometimes, because of boundary effects or noise in the heavy tail, this method causes a large error. There are several ways to address this problem.
The first solution is logarithmic binning: bin the data into exponentially wider bins and normalize by the width of the bin. This can give a good fit line, but some information is lost. The second solution is the cumulative distribution: the cumulative probability of a power-law distribution is also a power law, but with exponent a-1, and no information is lost with this method. A third way is to choose a good starting point for the fit; for example, in the distribution of citations to papers, the power law is evident only in the tail. The fourth way is maximum likelihood fitting, which is the best way, but to use this method you have to make sure that you actually have a power-law distribution.
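As a sketch of the maximum likelihood approach, the standard estimator for a continuous power law with lower cutoff x_min is alpha = 1 + n / sum(ln(x_i / x_min)); the synthetic data and the choice of x_min below are assumptions made for the example.

```python
import numpy as np

def power_law_alpha_mle(x, xmin):
    """Maximum likelihood estimate of the power-law exponent for the
    tail x >= xmin (continuous approximation):
        alpha = 1 + n / sum(ln(x_i / xmin))
    Only meaningful if the data really follow a power law above xmin."""
    tail = np.asarray([v for v in x if v >= xmin], dtype=float)
    return 1.0 + len(tail) / np.sum(np.log(tail / xmin))

# Toy data: draw samples from a known power law (alpha = 2.5) by inverse transform
rng = np.random.default_rng(0)
xmin, alpha_true = 1.0, 2.5
samples = xmin * (1 - rng.random(10_000)) ** (-1.0 / (alpha_true - 1.0))
print(power_law_alpha_mle(samples, xmin))   # should come out close to 2.5
```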
A power law looks the same no matter what scale we look at it on; we call this scale-free.
At the end of class, to understand power-law distributions further, we looked at many examples of power laws and related distributions, such as the cumulative distribution, the power law with an exponential cutoff, Zipf's law, and the Pareto principle.