聯系方式

您當前位置:首頁 >> Python編程Python編程

日期:2019-12-27 07:05

Project 3

Non-overlapping Community Detection in Large Networks

Due: 2019-12-10

In this project, you are going to detect / reveal significant communities in large

networks using current various community detection (graph mining, clustering)

methods.

1. Detect / reveal communities in the following networks:

http://snap.stanford.edu/data/index.html

Networks with ground-truth communities

Name Type Nodes Edges Communities Description

com-LiveJournal Undirected,

Communities 3,997,962 34,681,189 287,512

LiveJournal online

social network

com-Friendster Undirected,

Communities 65,608,366 1,806,067,135 957,154

Friendster online

social network

com-Orkut Undirected,

Communities 3,072,441 117,185,083 6,288,363

Orkut online social

network

com-Youtube Undirected,

Communities 1,134,890 2,987,624 8,385

Youtube online

social network

com-DBLP Undirected,

Communities 317,080 1,049,866 13,477

DBLP collaboration

network

com-Amazon Undirected,

Communities 334,863 925,872 75,149 Amazon product

network

email-Eu-core

Directed,

Communities 1,005 25,571 42 E-mail network

wiki-topcats Directed,

Communities 1,791,489 28,511,807 17,364 Wikipedia hyperlinks

2. The following methods should be implemented and evaluated:

1) Hierarchical clustering using Jaccard index;

2) Spectral Clustering;

3) CNM (Community Detection in Complex Networks Using External Optimization);

4) HRG (Hierarchical random graph, http://tuvalu.santafe.edu/~aaronc/hierarchy/);

5) Infomap (http://igraph.org/python/doc/igraph.Graph-class.html);

6) Fast Unfolding community detection (http://arxiv.org/pdf/0803.0476v2.pdf,

community detection in networkx package);

7) Multi-Scale Community Detection using Stability as Optimisation Criterion in a

Greedy Algorithm (http://www.elemartelot.org/index.php/programming/cd-code);

8) Multi-Scale Community Detection using Stability Optimisation

(http://www.elemartelot.org/index.php/programming/cd-code);

Igraph (http://igraph.org/python) has implemented methods 1)-5);

networkx (https://networkx.github.io/,

http://blog.sciencenet.cn/blog-404069-337442.html) has implemented 6);

Codes of 7) and 8) are listed as above.

3. Use “community detection benchmark” to evaluate the performances of

community detection methods:

http://arxiv.org/abs/0805.4770

https://sites.google.com/site/santofortunato/inthepress2

--“Package 1 includes the code to generate undirected and unweighted graphs with

overlapping communities. The extent of the overlap can be tuned by input, and it can

be set to zero if one is interested in non-overlapping clusters.”

In this project, we are just focusing on the undirected and unweighted graphs and

non-overlapping communities, such that you can just use the Package1.

Note: The Benchmark graphs generates series of graphs with varying degrees

community structure via changing the mixing parameter μ, for less μ the community

structure is more distinct and earlier to be detected.

4. How to evaluate the performance? The NMI is a good measurement

(http://blog.sina.com.cn/s/blog_45e6be080101dlya.html). You can refer to this source

code:

http://www.mathworks.com/matlabcentral/fileexchange/35625-information-theory-tool

box/content/nmi.m

5. Report your experiments’ details, including the methods you used, the

implementation details, and the performance evaluation, comparison, analysis and

discussion. You are encouraged to improve the current methods, even develop a

novel method to apply in these datasets. The report should be consist of four parts:

1) introduction and related works; 2) method and implementation; 3) experimental

results and 4) analysis and discussion.

6. You can apply the HPC resource for this project via this link:

https://www.must.edu.mo/ssi/labs/hpc.


版權所有:編程輔導網 2018 All Rights Reserved 聯系方式:QQ:99515681 電子信箱:[email protected]
免責聲明:本站部分內容從網絡整理而來,只供參考!如有版權問題可聯系本站刪除。

25选5一等奖多少钱