Special interest tracks and posters of the 14th international conference on World Wide Web - WWW '05, 2005
Link Analysis has been a popular and widely used Web mining technique, especially in the area of Web search. Various ranking schemes based on link analysis have been proposed, of which the PageRank metric has gained the most popularity with the success of Google. Over the last few years, there has been significant work in improving the relevance model of PageRank to address issues such as personalization and topic relevance. In addition, a variety of ideas have been proposed to address the computational aspects of PageRank, both in terms of efficient I/O computations and matrix computations involved in computing the PageRank score. The key challenge has been to perform computation on very large Web graphs. In this paper, we propose a method to incrementally compute PageRank for a large graph that is evolving. We note that although the Web graph evolves over time, its rate of change is rather slow when compared to its size. We exploit the underlying principle of the first-order Markov model on which PageRank is based to incrementally compute PageRank for the evolving Web graph. Our experimental results show a significant reduction in computational cost, since the computation involves only the (small) portion of the Web graph that has undergone change. Our approach is quite general, and can be used to incrementally compute (on evolving graphs) any metric that satisfies the first-order Markov property.
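For readers unfamiliar with the metric, the following is a minimal sketch of the standard power-iteration form of PageRank, not the incremental method proposed in the paper. The function name `pagerank`, the damping value 0.85, the uniform handling of dangling nodes, and the toy graph are all assumptions made for illustration. It shows the first-order property the abstract relies on: a node's score in each step depends only on the current scores of the nodes linking to it.

```python
def pagerank(out_links, damping=0.85, tol=1e-10, max_iter=200):
    """Plain power iteration over a dict {node: [successor, ...]}.

    Illustrative only: every successor must also appear as a key.
    """
    nodes = sorted(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        # Score held by dangling nodes (no out-links) is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {v: base for v in nodes}
        for v in nodes:
            if out_links[v]:
                # Each node passes an equal share of its score to its successors.
                share = damping * rank[v] / len(out_links[v])
                for w in out_links[v]:
                    new_rank[w] += share
        delta = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical four-page graph; "d" has no incoming links.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = pagerank(graph)
```

In this toy graph, page "d" receives no links, so its score stays at the teleportation baseline, while "c", which collects links from three pages, ranks highest.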
Proceedings of the 6th international conference on Web engineering - ICWE '06, 2006
PageRank is a popular ranking metric for large graphs such as the World Wide Web. Current research techniques for improving computational efficiency of PageRank have focussed on improving the I/O cost, convergence and parallelizing the computation process. In this paper, we propose a "divide and conquer" strategy for efficient computation of PageRank. The strategy is different from contemporary improvements in that it can be combined with any existing enhancements to PageRank, giving way to an entire class of more efficient algorithms. We present a novel graph-partitioning technique for dividing the graph into subgraphs, on which computation can be performed independently. This approach has two significant benefits. Firstly, since the approach focuses on work-reduction, it can be combined with any existing enhancements to PageRank. Secondly, the proposed approach leads naturally into developing an incremental approach for computation of such ranking metrics given that these large graphs evolve over a period of time. The partitioning technique is both lossless and independent of the type (variant) of PageRank computation algorithm used. The experimental results for a static single graph (graph at a single time instance) as well as for the incremental computation in case of evolving graphs, illustrate the utility of our novel partitioning approach. The proposed approach can also be applied for the computation of any other metric based on first order Markov chain model.
2009 International Conference on Computational Science and Engineering, 2009
In recent years, researchers have taken notice that virtual environments such as EverQuest II serve as a major mechanism for socialization. In particular, educational research has found virtual environments to be a sound venue for studying learning, collaboration, social participation, literacy in online space, and learning trajectory at the individual level as well as at the group level. The present research is concerned with learning in virtual environments and examines online player performance in EverQuest II, a popular massively multiplayer online role-playing game (MMORPG) developed by Sony Online Entertainment. The study uses the game's player performance data to devise performance metrics for online players. The study reports three major findings. First, we show that the game's point scaling system overestimates the performance of lower-level players and underestimates the performance of higher-level players. We present a novel point scaling system based on the game's player performance data that addresses the underestimation and overestimation problems. Second, we present a highly accurate predictive model for player performance as a function of past behavior. Third, we show that playing in groups has an impact on player performance and that individual characteristics alone are not sufficient for explaining an individual's performance, which calls for a different set of performance metrics. The discrepancy between the point scaling system in the game and observed player performance can be used as a guide to modify the existing system to better reflect the expected learning behaviors at different levels.
With social interaction playing an increasingly important role in the online world, the capability to extract latent communities based on such interactions is becoming vital for a wide variety of applications. However, existing literature on community extraction has largely focused on methods based on the link structure of a given social network. Such link-based methods ignore the content of
2010 IEEE International Conference on Data Mining, 2010
This paper presents a generalized version of the linear threshold model for simulating multiple cascades on a network while allowing nodes to switch between them. The proposed model is shown to be a rapidly mixing Markov chain and the corresponding steady state distribution is used to estimate highly likely states of the cascades' spread in the network. Results on a variety of real world networks demonstrate the high quality of the estimated solution.
The rich social media data generated in virtual worlds has important implications for business, education, social science, and society at large. Similarly, massively multiplayer online games (MMOGs) have become increasingly popular and have online communities comprising tens of millions of players. They serve as unprecedented tools for theorizing about and empirically modeling the social and behavioral dynamics of individuals, groups,
... In 1993 [3], Mark Glickman sought to improve upon the Elo rating system by addressing the ratings reliability issue in the Glicko rating system ... MSR-TR-2006-80 (2006) 5. Yukelson, D.: Principles of effective team building interventions in sport: A direct services approach at penn ...
Sixth International Conference on Data Mining (ICDM'06), 2006
Interpersonal interaction plays an important role in organizational dynamics, and understanding these interaction networks is a key issue for any organization, since these can be tapped to facilitate various organizational processes. However, the approaches to collecting data about them using surveys/interviews are fraught with problems of scalability, logistics and reporting biases, especially since such surveys may be perceived to be intrusive. Widespread use of computer networks for organizational communication provides a unique opportunity to overcome these difficulties and automatically map the organizational networks with a high degree of detail and accuracy. This paper describes an effective and scalable approach for modeling organizational networks by tapping into an organization's email communication. The approach models communication between actors as nonstationary Bernoulli trials, and Bayesian inference is used for estimating model parameters over time. This approach is useful for socio-cognitive analysis (who knows who knows who) of organizational communication networks. Using this approach, novel measures are explained for analysis of (i) closeness between actors' perceptions about such organizational networks (agreement) and (ii) divergence of an actor's perceptions about the organizational network from reality (misperception). Using the Enron email data, we show that these techniques provide sociologists with a new tool to understand organizational networks.
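The abstract does not give the exact model, so the following is only a rough sketch of how Bernoulli trials with a drifting success probability might be tracked with Bayesian updating. It assumes a Beta prior over the probability that one actor communicates with another in a given time step, and a hypothetical forgetting factor `decay` that discounts older evidence; neither the prior nor the discounting scheme is taken from the paper.

```python
def update_beta(alpha, beta, observed, decay=0.9):
    """Discount the old pseudo-counts, then add the new Bernoulli observation."""
    alpha = decay * alpha + (1.0 if observed else 0.0)
    beta = decay * beta + (0.0 if observed else 1.0)
    return alpha, beta


def posterior_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Hypothetical history for one ordered pair of actors:
# 1 = an email was sent in that time step, 0 = none was sent.
history = [1, 1, 1, 0, 0, 0, 0, 0]

alpha, beta = 1.0, 1.0  # uniform Beta(1, 1) prior (an assumption)
estimates = []
for sent in history:
    alpha, beta = update_beta(alpha, beta, bool(sent))
    estimates.append(posterior_mean(alpha, beta))
```

With discounting, the estimate rises above one half during the early run of emails and falls below it once the communication stops, which is the kind of time-varying behavior a stationary model would smooth away.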
This paper explains classical social network analysis and discusses how computer networks effect a shift in constructing social networks. The paper then concentrates on analyzing cognitive aspects of a social network, explaining a simple but scalable approach for modeling a socio-cognitive network. Novel measures using such a socio-cognitive network model are defined, and applications of such measures to extract useful information are illustrated on the Enron email dataset. The paper then describes a Dempster-Shafer theory based approach towards modeling a cognitive knowledge network and uses the Enron email dataset to illustrate how the proposed model can be used to capture actors' perceptions in a knowledge network. The paper concludes with a summary of the proposed models and a discussion on new research directions that can arise due to such cognitive analyses of electronic communication data. "somewhere behind the formal organizational chart at Indsco was another shadow structure in which dramas of power were played out." Such social communities, whether established via computer networks or not, play an important role in establishing communities of practice, which in turn serve as a basis for knowledge management in an organization. In a social network, individuals are usually influenced in their behavior by their psychosocial filter. Such a filter is affected by several factors like (i) Social confidence: composed of comfort zone (perception about self) and approachability (perceptions about others); (ii) Credibility: ability to separate useful knowledge from irrelevant knowledge; and (iii) Trustworthiness: personal knowledge is an important commodity which is shared only based on trust.
Social networks may also negatively affect an organization, as they serve as a diffusion network for spreading rumors and other information that may be harmful for the organization.
People interact with each other for various reasons. Based on the purpose of the relationship, these interactions exhibit certain characteristics. One such important characteristic is that of concealment. Concealed relations can often be a source of interest, especially in the domain of counterterrorism, where relations fostering malicious activities tend to be secretive or concealed from the general public. In this paper we propose a technique for extracting concealed relations from social network data. The technique analyzes actors' perceptions regarding other actors' social interactions and requires that they can be constructed from the social network data. One popular communication medium for which this can be done efficiently is electronic mail. The proposed technique uses the popular and robust tf-idf measure from the information retrieval literature to quantify the concept of concealment. We present experimental results from the Enron email corpus.
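The abstract does not describe how tf-idf is mapped onto actors' perceptions, so the sketch below only illustrates the standard measure itself on a hypothetical toy collection. The function name `tf_idf`, the document names, and the terms are placeholders; the weighting used is term frequency times the natural log of (number of documents / document frequency), one common variant among several.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return {doc: {term: weight}} for a dict {doc: [term, ...]}.

    Assumes every document contains at least one term.
    """
    n_docs = len(docs)
    # Document frequency: in how many documents does each term occur?
    df = Counter()
    for terms in docs.values():
        df.update(set(terms))
    weights = {}
    for name, terms in docs.items():
        counts = Counter(terms)
        total = len(terms)
        weights[name] = {
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        }
    return weights


# Hypothetical toy collection: "y" occurs everywhere, "x" only in d1.
docs = {"d1": ["x", "x", "y"], "d2": ["y", "z"], "d3": ["y"]}
w = tf_idf(docs)
```

A term that appears in every document ("y") gets weight zero, while a term concentrated in one document ("x") gets a high weight there. That contrast between locally frequent and globally rare is the property that makes the measure a plausible candidate for scoring relations known to few observers.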
Next Generation of Data Mining, Taylor and Francis, 2008
The field of social network analysis evolved from the need to understand social relationships and interactions within a group of individuals. Knowing all individuals (employees) in an organization is difficult for an employee due to his/her limited bandwidth. Thus, in an organization's social network, not everyone directly knows (or interacts) with each other (Cross et al., 2002). Nor does an individual observe all the communication between individuals known (or unknown) to him/her directly. The result is that each individual forms ...
Workshop on Link Analysis, Counterterrorism and Security at SIAM Conference on Data Mining, 2006
Knowledge management in organizations is gaining in importance, especially with the advent of computer networks. Networks foster interaction between individuals, and have become the medium of choice for all types of interactions, both professional and social. In this research, we study the perception of knowledge in an organization's email network. An important aspect of an individual's knowledge is that it may be incomplete and hence any analysis approach must handle knowledge uncertainty. We propose an approach based ...
Advanced communication technologies enable strangers to work together on the same tasks or projects in virtual environments. Understanding the formation of task-oriented groups is an important first step to study the dynamics of team collaboration. In this paper, we investigated group combat activities in Sony's EverQuest II game to identify the role of player and group attributes on group formation. We found that group formation is highly influenced by players' common interests in challenging tasks.
Papers by Nishith Pathak