
Data mining is the process of discovering patterns in large data sets. The
overall goal of the data mining process is to extract information from a data
set and transform it into an understandable structure for further use. Graph
mining is the process of gathering and analyzing data represented as graphs.
The rapid rise of online social networks has renewed strong interest in the
techniques developed over several decades for mining graphs and social
networks. In this paper we give an overview of graph mining tasks and of the
tools used for mining data represented as graphs.

Keywords:
Data Mining, Graphs, Graph Mining, Gephi, NetworkX.


1. Introduction:

Data mining is a term from computer science. It is sometimes also called
knowledge discovery in databases (KDD).

Although Tim Berners-Lee envisioned a read/write Web (the very first browser
also worked as an HTML editor), the Web was a read-only medium for the majority
of users. The Web of the 1990s was much like a combination of a phone book and
the yellow pages (a mix of individual postings and corporate catalogs), and
despite the connecting power of hyperlinks it instilled little sense of
community among its users. This passive attitude toward the Web was broken by a
series of changes in usage patterns and technology that are now referred to as
Web 2.0, a buzzword popularized by Tim O’Reilly.

In the following, we summarize the history and the defining aspects of Web 2.0.
This set of innovations in the architecture and usage patterns of the Web gave
the online world an entirely different role, as a platform for intense
communication and social interaction. A recent major survey based on interviews
with 2200 adults shows that the internet significantly improves Americans’
capacity to maintain their social networks, despite early fears about the
effects of diminishing real-life contact. The survey confirms that networks are
not only maintained and extended online but are also successfully activated
for dealing with major life situations such as getting support during a major
illness, looking for jobs, or seeking information about major investments.

The first wave of socialization on the Web was due to the appearance of blogs,
wikis and other forms of web-based communication and collaboration. Blogs and
wikis attracted mass popularity from around 2003. What they have in common is
that they both significantly lower the requirements for adding content to the
Web: editing blogs and wikis no longer required any knowledge of HTML. Blogs
and wikis allowed individuals and groups to claim their personal space on the
Web and fill it with content with relative ease.

 

1.1 Social Network:

Network analysis is a branch of sociology and mathematics that is increasingly
applied to questions outside the social domain as well. By no means do we
expect to provide complete coverage of any topic involved. For a more
encyclopedic treatment of network analysis we refer the reader to the social
network analysis reference of Wasserman and Faust. Network analysis should
appeal to all as one of the most formalized branches of social science. Most of
these formalisms are based on the simple node-and-edge representation of social
networks, to which a large array of measures and statistics can be applied.
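
The node-and-edge representation can be illustrated with a short sketch in
plain Python. The actors and ties below are hypothetical and chosen only for
illustration, and the helper names (build_adjacency, degree, density) are our
own rather than part of any particular tool. Degree and density are two of the
simplest measures that can be computed on such a representation; graph
libraries such as NetworkX provide comparable measures.

```python
from collections import defaultdict

# Hypothetical friendship ties among five illustrative actors
# (assumed data, not taken from any survey or study).
edges = [
    ("Ann", "Bob"),
    ("Ann", "Cid"),
    ("Bob", "Cid"),
    ("Cid", "Dee"),
    ("Dee", "Eve"),
]


def build_adjacency(edge_list):
    """Return an undirected adjacency map: node -> set of neighbours."""
    adjacency = defaultdict(set)
    for u, v in edge_list:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return dict(adjacency)


def degree(adjacency):
    """Number of ties each actor has."""
    return {node: len(nbrs) for node, nbrs in adjacency.items()}


def density(adjacency):
    """Share of all possible ties that are actually present."""
    n = len(adjacency)
    if n < 2:
        return 0.0
    # Each undirected tie appears in two neighbour sets, hence the division.
    m = sum(len(nbrs) for nbrs in adjacency.values()) / 2
    return 2 * m / (n * (n - 1))


adjacency = build_adjacency(edges)
degrees = degree(adjacency)
network_density = density(adjacency)

print(degrees)
print(network_density)
```

In this toy network Cid has the most ties (three), and half of the ten possible
ties among the five actors are present.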

What is network analysis?

Social Network Analysis
(SNA) is the study of social relations among a set of actors. The key
difference between network analysis and other approaches to social science is
the focus on relationships between actors rather than the attributes of
individual actors.

Network analysis takes a global view on social structures based on the belief
that types and patterns of relationships emerge from individual connectivity
and that the presence (or absence) of such types and patterns has substantial
effects on the network and its constituents. In particular, the network
structure provides opportunities for and imposes constraints on the individual
actors by determining the transfer or flow of resources (material or
immaterial) across the network.

The focus on relationships as opposed to actors can be easily understood
through an example. When trying to predict the performance of individuals in a
scientific community by some measure (say, number of publications), a
traditional social science approach would look at the attributes of the
researchers, such as the amount of grants they attract, their age, the size of
the team they belong to, etc. A statistical analysis would then proceed by
trying to relate these attributes to the outcome variable, i.e. the number of
publications.

In the same context, a network analysis study would focus on the
interdependencies within the research community. For example, one would look at
the patterns of relationships that scientists have and the potential benefits
or constraints such relationships may impose on their work. One may hypothesize
that certain kinds of relationships arranged in a certain pattern are
beneficial to performance compared to the case when that pattern is not
present. The patterns of relationships may be used not only to explain
individual performance but also to hypothesize their impact on the network
itself (network evolution). Attributes typically play a secondary role in
network studies, as control variables.