Volume 6 Issue 2
Fall 2010
ISSN 1937-7266

A Unified Framework for Social Networks

Robert Meusel

PhD Candidate
KR & KM Research Group
University of Mannheim, Germany

Abstract

The goal of my PhD thesis is the development of a new framework and methodology to reach the global potential of social networks and thereby support the ongoing growth of those structures. The framework is based on a flexible and efficient semantic data structure and respects the idiosyncrasies of different networks, such as interest networks and business networks. Based on this new unified framework, network analysis methods are implemented and evaluated to exploit the massive amount of available data. These methods deliver aggregated data for internal and external services such as recommendation of items and connections or market research.

We present the motivation for the thesis that forms the basis for the problem definition. After a review of the current state of the art and the related work, our approach and the research methodology are described. In the end a short overview of the current status is given.

1 Motivation

In recent years the Internet has become an indispensable part of the daily life of almost one sixth of the world population (around 41% in the Asian area and 28% in Europe; ZDNet.de 2010)1, and a generation of digital social natives has arisen. Millions of bytes of data are created and transferred every day through the Internet; in particular, social networks attract more and more people every day. Although the willingness of community participants to share and publish information is huge, the networks lack methods of monetization, which inhibits their growth. Paid services, such as targeted advertisement, and market research are two possibilities that network operators could provide to the economy to create a win-win situation. Likewise, the integration of social networks into information retrieval systems like digital libraries or e-commerce platforms is promising and could also create a win-win situation for both parties involved.

Although these possibilities are completely different, they lead to two essential challenges, which must first be overcome. First, the heterogeneity of the data structures and of the data within the networks prevents unified methods from being easily applied to them. Second, meaningful analyses using the huge amount of available but more or less low-quality data are not yet available. Additionally, these analyses and their access to the sensitive data are subject to strict privacy rules.

The main goal of reaching the full potential of social networks is to adapt already existing methods, which are successfully used within the areas of semantic web, digital libraries and traditional market research, to this new circumstance. Additionally it is interesting to survey how, for example, digital libraries could use the services of social networks to recommend books to users based on what others have previously borrowed, and get users with similar taste in touch with each other.

2 State of the Art

In recent years a huge number of new social networks have arisen all over the world. These networks exist in several different languages with different orientations and topics. Some of them are very general, with no special focus in topic or audience, such as Facebook or hi5. Some of them have a clearly defined orientation like business relations, and some of them have a strictly controlled audience like schülerVZ, a German social network especially for pupils. This social network is monitored by the operators and sign-in is only allowed for invited pupils. Additionally there are social networks which, at first sight, do not look like one. Twitter, first presented in March 2006, belongs to the group of "microblogs" but can also be seen as a social network. It includes a small profile, relationships between the participants, and communications based on "tweets". These "tweets" usually have a special topic, but they can also be used to communicate directly with other Twitter participants. The big advantage of Twitter is its basic privacy policy: by default, all profiles and all communications are public.

Facebook, which protects most of the data in a profile by default, was one of the first social networks to successfully implement application programming interfaces (APIs) that allow internal services (fan pages and applications like Mafia Wars) to access the users' data. Following Facebook, some other social networks also started providing these kinds of interfaces for services within the network but did not follow the same "all-access" strategy that Facebook did. The VZ Group (meinvz.de, studivz.de, schülervz.de) implemented a second profile, empty by default, which is accessible by the API and does not offer the same richness of information.

In April 2010 Facebook took the next step by publishing the Facebook Single Sign-On SDK2 and the Facebook Graph API3. This system is an extension of the Facebook Connect system which was available before. The SDK is made especially for external applications, such as web pages providing a login section or external game platforms, where users can register easily with their Facebook login. In doing so they have to grant a variety of permissions to the external application. Besides accessing the user's personal data like age, gender, interests and others, the applications may now, if the user grants the corresponding permission, take control over the user's "feeds"4. In this particular case the application is able to post on the signed-in user's wall without asking again. Possible cases are the posting of new high scores directly into the user's profile or a mention of what a user has just bought in an online store. Using this API it is no longer very difficult for other pages, whether they are social networks, e-commerce platforms or normal web pages, to gain access to and gather data from users' Facebook profiles. Unfortunately there is no way to propagate data that is only available on an external network or page into the Facebook network. Facebook Connect is a one-way connection: it is possible to get data, but not to push additional profile data into the network. This means that the Facebook network could not be used as a unified social network framework without losing data, because it is not flexible enough.

Although a lot of data is now almost "freely" available from and in the network, only a few services that help the network to earn money are available. Some weeks ago Facebook launched an advertisement-requesting API that allows applications to request advertisements from the Facebook internal ad campaigns. This service is mainly focused on monetizing the free services offered by the platform, which is in fact one of the main problems of social networks. Although they have a paid "premium" model, they have a lot of competitors and not every social network has the same development power as Facebook. They cannot afford to always launch new features to attract participants and increase the number of users.

Another interesting service was launched by Google and is called Social Graph API 5. This API allows exploration of connections between pages and individual persons within the World Wide Web. The API uses public information of pages and profiles to offer additional information about users of web pages and applications. This information could help developers and designers to adjust their content to the special interests of their audience.

3 Problem Description

Definition 1 Looking at informal, transient forms of association, such as the flow of gossip, the mobilization of social movements and political campaigns, and the maintenance of patron-client relations. Such networks are groups of persons who do not necessarily know each other or share anything outside the organizing criteria of the network.

The definition given above by Calhoun [5] explains the construct of social networks more clearly. Based on this definition, social network analysis is a well established technique of social sciences for analyzing social communities and the role of individual persons in these communities. Traditionally, information about the structure of the social network and the characteristics of the participants involved in the network had to be acquired manually using survey techniques [12].

The increasing use of modern information and communication technologies has improved this situation, since these new media can be used as a source of information about relations between persons by analyzing phone calls, email exchanges or co-occurrence of names on web pages. For a few years now, social networks like Facebook, QQ and hi5 have been used by many people, who themselves provide information about their interests, activities and social network. By now, there are more than a hundred of these platforms in Germany alone, covering all kinds of communities from businessmen to vegetarians. While these platforms provide invaluable data for social network analysis, they also come with problems concerning the ability to perform meaningful analyses: because the data is entered by the users themselves and not acquired in systematic surveys, it is more heterogeneous and of lower quality. In order to benefit from the data available from social networks, methods for dealing with the heterogeneity and the low quality of the data have to be developed and evaluated.

Because of these issues a comprehensive data integration of the available information in social networks has to be performed first. Secondly, meaningful network analysis methods supporting internal services like recommendations as well as external services like targeted advertisement and market research have to be implemented within this unified social network framework. During the whole process, a special focus has to be put on security and privacy of the used data.

3.1 Data integration

The outcome of the integration has to be a unified social network framework, which could be easily extended to serve new needs that constantly arise. The created structure has to return results on time-critical queries within a sufficient time frame as well as on resource-intensive queries based on limited resources. Currently there are different ideas like OpenID and Facebook Connect to populate the participants' data from one "master" network into another "slave" network and lower the hurdle of entry. Unfortunately, this data population does not respect peculiarities of other networks and does not allow a backwards population, which leads to distributed data.

Taking these facts into account three different levels of data integration can be defined.

The Structural Integration The structural integration is the easiest of the three tasks. For this task, it is possible and sensible to resort to existing technologies for representing heterogeneous data using semantic web technologies. In particular, the Resource Description Framework (RDF) has been designed as a lingua franca for representing heterogeneous data on the web, providing a common syntax and data model for representing information from different sources.
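As a minimal sketch of what such a structural integration could look like, the following Python fragment maps two differently shaped profile records onto one set of subject-predicate-object triples in the spirit of the RDF data model. The record layouts, the `ex:` prefix and the predicate names are assumptions chosen purely for illustration; they are not taken from any real network schema or from the framework itself.

```python
# Hypothetical records from two networks with different native structures.
business_profile = {"uid": "b-17", "job": "engineer", "contacts": ["b-42"]}
interest_profile = {"id": 903, "likes": ["cinema", "archery"], "friends": [311]}


def business_to_triples(rec):
    """Map a business-network record to (subject, predicate, object) triples."""
    s = f"ex:business/{rec['uid']}"
    triples = [(s, "ex:occupation", rec["job"])]
    triples += [(s, "ex:knows", f"ex:business/{c}") for c in rec["contacts"]]
    return triples


def interest_to_triples(rec):
    """Map an interest-network record to the same triple vocabulary."""
    s = f"ex:interest/{rec['id']}"
    triples = [(s, "ex:interest", like) for like in rec["likes"]]
    triples += [(s, "ex:knows", f"ex:interest/{f}") for f in rec["friends"]]
    return triples


# One graph, one data model, regardless of the source structure.
graph = set(business_to_triples(business_profile)) | set(
    interest_to_triples(interest_profile)
)


def objects(graph, predicate):
    """Return all objects of triples with the given predicate, sorted."""
    return sorted(o for (_, p, o) in graph if p == predicate)


knows = objects(graph, "ex:knows")
```

Once both sources share the triple form, a single query such as `objects(graph, "ex:knows")` can run over all imported networks at once, which is the practical benefit of a common data model.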

The Semantic Integration The semantic integration will homogenize the data on the semantic level. This step mostly consists of disambiguating data and relating the descriptions of participants and relations in the different networks. For instance, most social networks allow their members to describe personal information such as activities, interests and favorites. These descriptions are often given in an uncontrolled vocabulary, leading to a situation where people use different terms for referring to the same interests ('cinema' vs. 'going to the movies') and describe things at different levels of granularity ('sports' vs. 'Archery, i.e. traditional wand shoot'). The main task at this step is to identify synonymous descriptions ('cinema' vs. 'going to the movies') and specialization relationships between descriptions ('sports' vs. 'Archery, i.e. traditional wand shoot').
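The two relations named above, synonymy and specialization, can be illustrated with a very small hand-built thesaurus. The sketch below is only an assumption about how such a lookup might be organized; the dictionaries `SYNONYMS` and `BROADER` and the function names are hypothetical, and a real concept structure would be built and maintained semi-automatically rather than written by hand.

```python
# Hypothetical miniature thesaurus: synonyms map to a preferred term,
# and each preferred term may have one broader term.
SYNONYMS = {"going to the movies": "cinema", "movies": "cinema"}
BROADER = {"archery": "sports", "cinema": "entertainment"}


def normalize(term):
    """Lower-case a free-text term and replace it by its preferred term."""
    t = term.strip().lower()
    return SYNONYMS.get(t, t)


def ancestors(term):
    """Return the normalized term followed by all of its broader terms."""
    chain = [normalize(term)]
    while chain[-1] in BROADER:
        chain.append(BROADER[chain[-1]])
    return chain


def related(a, b):
    """Classify the relation of description a to description b."""
    na, nb = normalize(a), normalize(b)
    if na == nb:
        return "synonym"
    if nb in ancestors(na)[1:]:
        return "narrower"  # a is a specialization of b
    if na in ancestors(nb)[1:]:
        return "broader"  # a is a generalization of b
    return "unrelated"
```

With these toy entries, 'cinema' and 'going to the movies' are reported as synonyms, and 'Archery' is reported as narrower than 'sports'.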

Object Reconciliation Object Reconciliation is the last step in order to complete the data integration of different networks. This step, also referred to as 'object-' or 'reference reconciliation,' is based on the analysis of the description of individual participants and their relation to other participants. The underlying assumption is that different descriptions representing the same participant will have a lot in common. This step is complicated by the fact that due to privacy restrictions, data from online social networking platforms will only be available in an anonymized form. This means that significant information like the name or the address of the person cannot be used for object reconciliation. Additionally, due to the various domains of social networks, some descriptive data of the same participant may differ from network to network.
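The underlying assumption, that two descriptions of the same participant have a lot in common, can be made concrete with a simple overlap score. The following sketch combines the Jaccard overlap of interests with the Jaccard overlap of neighbours; it is an illustrative stand-in, not the graph similarity of [2] or the method of [19]. It assumes that interests have already been homogenized and that neighbours carry identifiers that are comparable across networks, and the equal weighting is arbitrary.

```python
def jaccard(a, b):
    """Jaccard overlap of two collections; 0.0 if both are empty."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def profile_similarity(p, q, w_attr=0.5):
    """Weighted mix of attribute overlap and ego-network overlap."""
    attr = jaccard(p["interests"], q["interests"])
    ego = jaccard(p["neighbours"], q["neighbours"])
    return w_attr * attr + (1 - w_attr) * ego


def best_match(profile, candidates):
    """Return the key of the candidate most similar to the given profile."""
    return max(candidates, key=lambda k: profile_similarity(profile, candidates[k]))


# Hypothetical anonymized profiles: no names or addresses, only IDs.
p = {"interests": {"cinema", "archery", "jazz"}, "neighbours": {"n1", "n2", "n3"}}
others = {
    "q": {"interests": {"cinema", "archery"}, "neighbours": {"n1", "n2", "n4"}},
    "r": {"interests": {"football"}, "neighbours": {"n9"}},
}
match = best_match(p, others)
```

In practice a threshold on the score would be needed as well, since the best candidate is not necessarily the same person.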

3.2 Security and Privacy

When talking about a framework which makes private data available to different services, security and privacy of data play an important role and have to be considered carefully. Primarily, the privacy of the data has to be guaranteed so that it is impossible to access data in violation of the restrictions of the different network EULAs6. It must be possible for the networks which allow access to the data to restrict visibility and usability according to their policies. An example could be that no data is given to external services directly, but that the data can be used by them to calculate projections, with only aggregated data being returned if the number of included data sets satisfies a preset restriction. Another policy could be that only data which is marked as public can be used by external services, while all other data can be used by internal services only. Here it is imaginable that the restrictions are adjusted by the users themselves. Another problem arises when sets of data which represent a real person are combined. Most of the single data sets will be anonymized (only an ID will be stored to map back to the user within the single networks) so that no inference can be made from the single attributes. But the combination of some sets of attributes may lead to an obvious inference about the real person. A simplified example could be that one network provides the professional status of a person and their residence, another provides the interests, and yet another the age and gender. By combining these it may be possible to reduce the matching persons to a manually checkable number. This effect, which could arise through object reconciliation, has to be prevented by the framework.
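Both policies discussed above can be sketched in a few lines: returning only aggregates whose group size meets a preset minimum, and checking how small a group becomes once attributes from several networks are combined. The function names, the threshold of three and the sample records are assumptions for illustration only.

```python
from collections import Counter


def aggregate(records, attribute, min_group_size=3):
    """Count records per attribute value, suppressing groups below the threshold."""
    counts = Counter(r[attribute] for r in records)
    return {value: n for value, n in counts.items() if n >= min_group_size}


def smallest_group(records, attributes):
    """Size of the smallest group of records sharing the same attribute combination."""
    groups = Counter(tuple(r[a] for a in attributes) for r in records)
    return min(groups.values())


# Hypothetical merged data sets after object reconciliation.
records = [
    {"city": "Mannheim", "job": "teacher", "age": 30},
    {"city": "Mannheim", "job": "teacher", "age": 41},
    {"city": "Mannheim", "job": "nurse", "age": 30},
    {"city": "Mannheim", "job": "nurse", "age": 41},
    {"city": "Heidelberg", "job": "teacher", "age": 30},
]

by_city = aggregate(records, "city")
```

In this toy data every job is shared by at least two persons, but the combination of job and age already isolates single individuals, which is exactly the inference risk the framework has to prevent.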

Moreover, general security issues (hacking, security gaps, etc.) have to be observed.

3.3 Network Analysis

The network analysis covers all methods and functionalities which will provide input for internal and external services. Services like product recommendation, targeted advertisements or market research based on the unified data will increase the market power of service providers and service users and thereby ensure ongoing growth. Additionally the internal services will add value to information retrieval systems that include a social network. Two analyses are particularly important for the mentioned services.

Influencer Analysis Influencer analysis means the identification of important people based on their social network. In market research it is important to identify people that have a strong influence on others because marketing actions specifically targeting these 'influencers' will be especially effective. Influencer analysis can also be used to identify experts on certain topics.
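One of the simplest ways to rank participants by their position in the network is degree centrality, the share of other participants a person is directly connected to. The sketch below uses it as an illustrative example; the thesis does not commit to this particular measure, and the edge list is made up.

```python
def degree_centrality(edges):
    """Degree centrality for an undirected graph given as a list of node pairs.

    Assumes at least two distinct nodes and no self-loops.
    """
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    n = len(neighbours)
    return {node: len(adj) / (n - 1) for node, adj in neighbours.items()}


# Hypothetical friendship relations.
edges = [("anna", "ben"), ("anna", "cara"), ("anna", "dan"), ("ben", "cara")]
centrality = degree_centrality(edges)
top_influencer = max(centrality, key=centrality.get)
```

Path-based measures such as betweenness or closeness follow the same pattern but also take indirect connections into account.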

Attribute Prediction Attribute prediction is the second important functionality. This function is the ability to predict the attributes of a participant based on limited information. In traditional market research, the population of potential customers is separated into a limited number of target group milieus with certain predefined interests and properties. In information retrieval this prediction can be used to recommend unknown items to a participant based on their affiliation to a group of other known items.
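A very simple baseline for predicting a missing attribute is a majority vote among the participant's direct connections. The sketch below illustrates this idea only; the data, the function name and the choice of a neighbour vote are assumptions, and more elaborate techniques such as association rules, clustering or collaborative filtering would replace it in practice.

```python
from collections import Counter


def predict_attribute(person, friends, known):
    """Predict a missing attribute value as the most common value among friends.

    Returns None if no friend has a known value.
    """
    votes = Counter(known[f] for f in friends.get(person, ()) if f in known)
    return votes.most_common(1)[0][0] if votes else None


# Hypothetical data: friendship lists and known favourite music genres.
friends = {"dan": ["anna", "ben", "cara"], "eve": []}
known = {"anna": "jazz", "ben": "jazz", "cara": "rock"}

prediction = predict_attribute("dan", friends, known)
```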

3.4 Research Questions

The main question of this PhD thesis is: Is it possible to build a framework which includes or has the ability to access data from different social networks and provide a unified interface for services, internal and external, to generate added value for service users and thus help social networks to grow? This question can be broken down into the following three sub-questions, which will be answered in this PhD thesis. The first two questions relate to the functionality and methodology of the framework. The last question should answer generically whether external applications could benefit from this framework.

  1. Is the designed framework capable of reacting to new social network mannerisms and can it still serve both time-critical as well as resource-intensive requests?
  2. Do the implemented internal and external services generate added value while observing the privacy and security restrictions of the data?
  3. Can a digital library benefit from the services provided by an integrated social network?

4 Related Work

Because of the variety of steps involved, a large number of approaches are related to the questions raised in this proposal.

4.1 Data Integration

In the first place, concerning the structural data integration, San Martin and Gutierrez [14] recently proposed a general framework for representing and manipulating social network information in RDF. Their approach is based on the notion of a bipartite graph, where participants and relations between participants represent disjoint node sets, and features a number of predefined transformations of the network data to support different types of analyses. Further, they show that many of the common analyses can be conducted by executing SPARQL queries on the transformed models. Regarding the semantic integration of the social network data, different approaches have been used in the past to build and maintain thesauri. Grefenstette [10] proposed a method based on co-occurrence of concepts to group relevant nouns out of documents and explore relations between them. Niepert et al. [15] used answer set programming on expert feedback to explore new semantic correlations in the area of philosophy. Meusel et al. [18] modified and extended a method based on the results of Bollegala et al. [3] to semi-automatically enhance existing thesauri with new concepts from documents. Besides the main idea of this approach--using a relatedness measure based on the definition of Lin [13] to maintain concept hierarchies--there are many more measures that could be used for this purpose (e.g. [24] or [9]). In order to identify descriptions representing the same participant, Champin and Solnon [2] used a similarity measure for labeled graphs to calculate the ego-centric network similarity of different individuals. Recently Rowe [19] has proposed a method for object reconciliation that uses the social network of participants as a criterion, showing the feasibility of this approach.

4.2 Network Privacy

Regarding the privacy and security issues of the available data, Giereth [8] made an interesting suggestion on how data within an RDF graph could be encrypted to ensure that structural and content-based information is only shown to requesters with the required security roles. His approach ensures that no serialization problems have to be handled and that all data--structural and content-based--stays within the RDF graph.

4.3 Network Analysis

A typical measure to identify influencers in social networks was proposed by Wasserman and Faust [21] and is based on the measures of centrality. These measures analyze paths in the social network graph and identify participants holding strategic positions in the network. Methods to predict attributes by using already known attributes of participants in the social network and to determine association rules defining relations between certain interests and activities were proposed by Yang et al. [23], Domingos et al. [6] and Schwartz et al. [20]. Another way to predict attributes is by using clustering techniques like those surveyed by Jain et al. [11]. In this approach, participants with common interests are clustered into a group and interests are assigned globally to groups. In another approach, Agrawal et al. [1] used association rule mining to identify attributes that normally occur together without clustering participants first. Another promising idea was presented by Popescul et al. [17], who used statistical relational learning to predict links between objects based on attributes. Breese et al. [4] used collaborative filtering to predict attributes. Some of these techniques have also been successfully used for analyzing large social networks (cf. [22] and [7]).
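To make the notion of an association rule between interests more tangible, the following sketch computes the two standard quantities, support and confidence, for a rule such as "cinema implies jazz". It only illustrates these definitions on invented profiles and is not a reimplementation of the mining algorithm of [1].

```python
def support(transactions, items):
    """Fraction of transactions that contain all of the given items."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated probability of the consequent given the antecedent.

    Assumes the antecedent occurs in at least one transaction.
    """
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical interest sets of four participants.
profiles = [
    {"cinema", "jazz"},
    {"cinema", "jazz", "archery"},
    {"cinema"},
    {"archery"},
]

rule_support = support(profiles, {"cinema", "jazz"})
rule_confidence = confidence(profiles, {"cinema"}, {"jazz"})
```

A rule with high confidence can then be used to suggest the consequent interest for participants who only list the antecedent.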

5 Approach and Extended Contribution

The general design of the framework can be seen in Figure 1. In the following, a short example shows the benefit social networks and service providers could generate with the help of such a framework. Subsequently the main idea and the different steps will be explained briefly.



Figure 1: Functional Design of Social Network Framework

5.1 An Example

Think about a medium-sized social network with a special group of users who have common interests but belong to different demographic and social groups. The network wants to offer free services to all users, but also has to generate money somehow. The only thing it can offer is data. It does not want to sell the data and annoy its users, but rather use it, for example, to give advertisers the possibility to attract special groups of users with certain advertisements or offers. Because the network is too small and does not have the capacity to negotiate with different media agencies, it can hardly earn money by searching for advertisers or managing the ad inventory on its own. By linking its data to the unified structure, no matter whether it imports the data directly or uses a service which links the data into the internal structure, it could benefit from advertisers who offer advertisements through the unified structure to a variety of other social networks. The benefit for the social network is obvious: it links its data once and can then use the services. The advertiser reaches only the people he wants to reach, and with each network that links itself to the unified structure, the reach increases and the granularity of the targeted ads can be improved. Similarly, many win-win situations exist where a unified structure can help to generate added value.

5.2 The Approach

The contribution of this thesis is the development of the framework and methodology sketched in the above figure 1. The framework that will be implemented will include a unified data structure and different predefined analyses delivering data for internal and external services.

In the first step, data from different social networks is imported into the unified, semantic data structure of this framework. During the import, the data is normalized and semantically integrated with the help of a permanently maintained concept structure. After the data has been imported completely, individuals (real world persons) have to be identified and their different profiles have to be merged and/or linked. In the next step, different network analysis methods are implemented and evaluated based on this data. This evaluation will not only pay attention to correctness but also to the added value generated by the output for the internal and external services/functions such as recommendation, targeted advertisements and market research. Subsequently, the created functionalities are applied to the users of an information retrieval system and evaluated according to the additional value they generate. A special focus will be set on digital libraries and how document recommendations and interactions lead to a more efficient and easier use of digital libraries.

The following steps are explained in more detail:

The structural homogenization In a first step the results of San Martin and Gutierrez [14] will be used to create a basic framework. According to the special needs of different networks, it will be necessary to adapt several parts of this structure. Additionally an object-oriented database structure and a relational database structure are evaluated based on the same set of data. The evaluation is done based on a set of queries, representing real world requests. Those queries include time-critical and resource-critical requests. According to the results of these evaluations the best representation will be used to establish a corporate social network framework.

Network privacy Already during the schema creation process the aspect of data privacy has to be considered. Here the approach of Giereth [8] will be taken into account and partly implemented into the framework. A possible drawback could result from the time the system needs to encrypt the graph and reorganize it. Using a set of representative queries the applicability of this approach will be evaluated for this framework and different modifications will be made regarding performance issues if necessary.

Data integration At this point, different possibilities have to be considered when talking about the way data can be made accessible to the framework. On the one hand it has to be possible for social networks themselves to import their anonymized data into the framework with the help of a simple importing interface. On the other hand, the individual user should be able to add their data from a special social network into the framework. The interface has to provide possibilities to recognize similarities between the framework structure and the social network specific structure and make suggestions about import strategies. However, the "importer" has to have the possibility to add their own attributes and connections and tell the network which kind of connections and attributes are added, so that the services running on the unified structure can use this "new" data for their analysis. The recognition process will be tested with different ontology matching strategies, for example the one Noessner et al. presented in 2010 [16]. During the import process also the data privacy policies have to be declared.
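The idea of recognizing similarities between the framework structure and a network-specific structure, and turning them into import suggestions, can be sketched with plain string similarity from the Python standard library. This is a deliberately naive stand-in for the ontology matching strategies mentioned above, such as [16]; the attribute list `FRAMEWORK_ATTRIBUTES`, the threshold and the function name are assumptions.

```python
from difflib import SequenceMatcher

# Hypothetical attribute names of the unified framework structure.
FRAMEWORK_ATTRIBUTES = ["interests", "date_of_birth", "gender", "residence"]


def suggest_mapping(source_attributes, threshold=0.6):
    """Suggest a framework attribute for each source attribute.

    Attributes without a sufficiently similar counterpart map to None,
    signalling that the importer has to declare them as new attributes.
    """
    suggestions = {}
    for src in source_attributes:
        scored = [
            (SequenceMatcher(None, src.lower(), tgt).ratio(), tgt)
            for tgt in FRAMEWORK_ATTRIBUTES
        ]
        score, best = max(scored)
        suggestions[src] = best if score >= threshold else None
    return suggestions


mapping = suggest_mapping(["Interests", "Gender", "shoe_size"])
```

Attributes that map to `None` correspond to the case described above in which the importer adds their own attributes and tells the framework what kind of data they represent.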

Data homogenization A huge amount of information is stored in unformatted text within the social network and therefore a lot of methods which are already established in the area of digital library and knowledge management are helpful. The main goal is to build up and include a concept hierarchy--thesaurus or ontology--which can be used to homogenize the included information. The idea of extracting relevant information from a huge number of available sources based on term weighting methods like tf-idf is promising. To create the underlying concept hierarchy different methods will be taken into account, for example the one of Meusel et al. [18] or Niepert et al. [15].
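Term weighting with tf-idf, mentioned above, can be sketched as follows. The variant shown uses relative term frequency and the unsmoothed logarithm of the inverse document frequency, which is one common choice among several; the whitespace tokenization and the sample texts are simplifying assumptions.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one dict of tf-idf weights per document.

    tf is the relative frequency of a term within its document,
    idf is log(N / df) with N documents and df documents containing the term.
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights.append(
            {
                term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
                for term, count in tf.items()
            }
        )
    return weights


# Hypothetical free-text profile snippets.
texts = ["i like cinema and jazz", "i like archery", "i like jazz"]
weights = tf_idf(texts)
top_term = max(weights[1], key=weights[1].get)
```

Terms that occur in every profile, such as "i" and "like" here, receive a weight of zero, while a term that is specific to one profile stands out as a candidate concept for the thesaurus.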

The network and data analysis This part is one of the last tasks this PhD thesis will cover. Because this thesis is strongly economically driven, the prediction of attributes and the identification of influencers will be considered first because they are highly interesting for market research and advertising.

6 Research Methodology

In this thesis, the following steps are to be taken:

6.1 Theoretic Considerations

After defining the problem and clearly formulating the research questions, the various problems which occur while identifying a sufficient data structure and meaningful network analysis have to be outlined. The framework which will be implemented has to be sketched with respect to the different problems and their possible solutions.

6.2 Analysis of existing solutions

Existing solutions will be analyzed and evaluated to ensure that the methodology supports and extends existing solutions and contributes new achievements.

6.3 Quality Measures and Evaluation

The quality of the structure will be measured based on comparable data structures, like relational and object-oriented databases. The same meaningful queries will be used--representing real-world requests in social networks--to prove the quality of the designed semantic data structure.

Because of the strong economic background of this PhD thesis, the implemented services like targeted advertisements and market research will be evaluated based on existing implementations and also on the return on investment (ROI) they result in.

6.4 Proof of Concept

The proof of concept for the single elements of the framework is given by the related approaches that are successfully used to deal with the single problems. A proof of concept study for the global framework will be done by implementing a rudimentary prototype in the next month.

6.5 Framework Design

Based on the requirements of the different networks, the underlying structure of the framework has to be flexible and easily adjustable to new needs. Additionally, not only peculiarities of social networks but also peculiarities of information retrieval systems have to be taken into account when designing the framework.

6.6 Method Development

The methods developed for the different data integration aspects and the later network analysis have to be as flexible as the underlying data structure. The implemented methods have to be capable of reacting to changes in that structure and have to take newly available information into account to return more valuable output.

7 Status

In the last month, a couple of sets of data have been collected to get an overview of the variety of available information. Different social networks have been analyzed regarding general aspects but also unique peculiarities. The next step is to design a framework and import all the different sets of data. In addition, the first data has been organized into a social network thesaurus and maintenance methods have been implemented.

8 Conclusion

This paper is a proposal for the development of "A Unified Social Network Framework". The problem definition and the review of the state of the art presented herein are the first steps done in this direction. The presentation of these steps pursues the goal to receive comments on the intended approach and helpful suggestions regarding the further work. The next step will be the implementation of the sketched framework that enables the development of further analysis methods and experimentations.

Footnotes

1 http://www.zdnet.de/news/wirtschaft_telekommunikation_zahl_der_internetnutzer_weltweit_uebersteigt_milliardengrenze_story-39001023-39201613-1.htm
2 Facebook JavaScript SDK documentation is available under http://developers.facebook.com/docs/guides/web
3 Facebook Graph API documentation is available under http://developers.facebook.com/docs/api
4 A full list of available publishing channels which are accessible via the API can be found under http://developers.facebook.com/docs/api in the "Publishing to Facebook" section.
5 Social Graph API documentation is available under http://code.google.com/apis/socialgraph
6 It has to be mentioned that the EULAs can change from time to time and that the EULAs of different networks can vary tremendously.

References

[1] R. Agrawal, T. Imielinski, and A. Swami. Mining association rules between sets of items in large databases. In ACM SIGMOD, pp. 207-216, 1993.
 
[2] P. Champin and C. Solnon. Measuring the similarity of labeled graphs. In Proceedings of the Fifth International Conference on Case-Based Reasoning, pp. 80-95. Springer, 2003.
 
[3] D. Bollegala, Y. Matsuo, and M. Ishizuka. Measuring semantic similarity between words using web search engines. In 16th International World Wide Web Conference (WWW2007), 2007.
 
[4] J. S. Breese, D. Heckerman, and C. Kadie. Empirical analysis of predictive algorithms for collaborative filtering. In Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence, pp. 43-52. Morgan Kaufmann, 1998.
 
[5] C. Calhoun. Dictionary of the Social Sciences. Oxford Univ Pr, 2002
[6] P. Domingos. Mining social networks for viral marketing. In IEEE Intelligent System, pp. 80-82, 2005.
 
[7] L. Garton, C. Haythornthwaite, and B. Wellman. Studying Online Social Networks, pp. 75 - 106. London, UK: Sage, 1999.
 
[8] M. Giereth. On partial encryption of RDF-graphs. In International Semantic Web Conference, volume 3729 of Lecture Notes in Computer Science, pp. 308-322. Springer, 2005.
 
[9] J. Gracia and E. Mena. Web-based measure of semantic relatedness. In Proc. of 9th International Conference on Web Information Systems Engineering (WISE 2008), Auckland (New Zealand), pp. 136-150. Springer, 2008.
 
[10] G. Grefenstette. Explorations in Automatic Thesaurus Discovery, volume 278 of The Springer International Series in Engineering and Computer Science. Kluwer Academic Publishers, 1994.
 
[11] A. K. Jain, M. N. Murty, and P. J. Flynn. Data clustering: A review, ACM Computing Surveys (CSUR), vol. 31 no. 3, pp. 264-323, Sept. 1999.
[12] D. Jansen. Einführung in die Netzwerkanalyse: Grundlagen, Methoden, Forschungsbeispiele, volume 3. Vs Verlag, 2006.
 
[13] D. Lin. An information-theoretic definition of similarity. In Proceedings of the 15th International Conference on Machine Learning, pp. 296-304. Morgan Kaufmann, 1998.
 
[14] M. San Martín and C. Gutierrez. Representing, querying and transforming social networks with rdf/sparql. In ESWC 2009: European Semantic Web Conference, 2009
 
[15] M. Niepert, C. Buckner, and C. Allen. Answer set programming on expert feedback to populate and extend dynamic ontologies. In Proceedings of the Twenty-First International Florida Artificial Intelligence Research Society Conference, May 15-17, 2008, Coconut Grove, Florida, USA, pp. 500-505, 2008.
 
[16] J. Noessner, M. Niepert, C. Meilicke, and H. Stuckenschmidt. Leveraging terminological structure for object reconciliation. In European Semantic Web Conference - ESWC, 2010.
 
[17] A. Popescul, R. Popescul, and L. H. Ungar. Statistical relational learning for link prediction. In IJCAI Workshop on Learning Statistical Models from Relational Data, 2003.
[18] K. Eckert, R. Meusel, M. Niepert and H. Stuckenschmidt. Thesaurus extension using web search engines. In International Conference on Asia-Pacific Digital Libraries (ICADL), Brisbane, Australia, 2010.
 
[19] M. Rowe. Applying semantic social graphs to disambiguate identity reference. In Proceedings of ESWC 2009, Heraklion, Crete, 2009.
 
[20] M. Schwartz and D. C.M. Wood. Discovering shared interests among people using graph analysis of global electronic mail traffic. Communications of the ACM, 36:78-89, 1992.
 
[21] S. Wasserman and K. Faust. Social network analysis: Methods and applications. Cambridge: Cambridge Univ. Press, 1994.
[22] B. Wellman. "An Electronic Group Is Virtually A Social Network," in Culture of the Internet, S. Kiesler, ed. Mahwah, NJ: Lawrence Erlbaum, 1997, pp. 179-205.
[23] Wan-Shiou Yang, Jia-Ben Dia, Hung-Chi Cheng, and Hsing-Tzu Lin. Mining social networks for targeted advertising. In Proceedings of the 39th Annual Hawaii International Conference on System Sciences (HICSS'06) Track 6, volume 6, page 137, 2006.
 
[24] Torsten Zesch, Christof Müller, and Iryna Gurevych. Using Wiktionary for computing semantic relatedness. In Proceedings of AAAI, 2008.