Active Learning for Hidden Attributes in Networks

In many networks, vertices have hidden attributes that are correlated with the network’s topology. For instance, in social networks, people are more likely to be friends if they are demographically similar. In food webs, predators typically eat prey of lower body mass. We explore a setting in which...

Full description

Bibliographic Details
Main Authors: Xiao Ran Yan, Jean-baptiste Rouquier, Yao Jia Zhu, Cristopher Moore
Other Authors: The Pennsylvania State University CiteSeerX Archives
Format: Text
Language:English
Subjects:
Online Access:http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.154.2081
http://www.cs.unm.edu/~treport/tr/10-01/paper-2010-02.pdf
id ftciteseerx:oai:CiteSeerX.psu:10.1.1.154.2081
record_format openpolar
spelling ftciteseerx:oai:CiteSeerX.psu:10.1.1.154.2081 2023-05-15T18:43:18+02:00 Active Learning for Hidden Attributes in Networks Xiao Ran Yan Jean-baptiste Rouquier Yao Jia Zhu Cristopher Moore The Pennsylvania State University CiteSeerX Archives application/pdf http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.154.2081 http://www.cs.unm.edu/~treport/tr/10-01/paper-2010-02.pdf en eng http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.154.2081 http://www.cs.unm.edu/~treport/tr/10-01/paper-2010-02.pdf Metadata may be used without restrictions as long as the oai identifier remains attached to it. http://www.cs.unm.edu/~treport/tr/10-01/paper-2010-02.pdf text ftciteseerx 2016-01-07T15:27:07Z In many networks, vertices have hidden attributes that are correlated with the network’s topology. For instance, in social networks, people are more likely to be friends if they are demographically similar. In food webs, predators typically eat prey of lower body mass. We explore a setting in which the network’s topology is known, but these attributes are not. If each vertex can be queried, learning the value of its hidden attributes— but only at some cost—then we need an algorithm which chooses which vertex to query next, in order to learn as much as possible about the attributes of the remaining vertices. We assume that the network is generated by a probabilistic model, but we make no assumptions about the assortativity or disassortativity of the network. We then query the vertex with the largest mutual information between its type and that of the others (a well-known approach in active learning) or with the largest average agreement between two independent samples of the Gibbs distribution which agree on its type. We test these approaches on two networks with known attributes, the Karate Club network and a food web of species in the Weddell Sea. In several cases, we found that the average agreement algorithm performs better than mutual information. The algorithms appear to explore the network intelligently, first querying vertices at the centers of communities, and then querying vertices at the boundaries between communities. 1 Text Weddell Sea Unknown Weddell Weddell Sea
institution Open Polar
collection Unknown
op_collection_id ftciteseerx
language English
description In many networks, vertices have hidden attributes that are correlated with the network’s topology. For instance, in social networks, people are more likely to be friends if they are demographically similar. In food webs, predators typically eat prey of lower body mass. We explore a setting in which the network’s topology is known, but these attributes are not. If each vertex can be queried, learning the value of its hidden attributes— but only at some cost—then we need an algorithm which chooses which vertex to query next, in order to learn as much as possible about the attributes of the remaining vertices. We assume that the network is generated by a probabilistic model, but we make no assumptions about the assortativity or disassortativity of the network. We then query the vertex with the largest mutual information between its type and that of the others (a well-known approach in active learning) or with the largest average agreement between two independent samples of the Gibbs distribution which agree on its type. We test these approaches on two networks with known attributes, the Karate Club network and a food web of species in the Weddell Sea. In several cases, we found that the average agreement algorithm performs better than mutual information. The algorithms appear to explore the network intelligently, first querying vertices at the centers of communities, and then querying vertices at the boundaries between communities. 1
author2 The Pennsylvania State University CiteSeerX Archives
format Text
author Xiao Ran Yan
Jean-baptiste Rouquier
Yao Jia Zhu
Cristopher Moore
spellingShingle Xiao Ran Yan
Jean-baptiste Rouquier
Yao Jia Zhu
Cristopher Moore
Active Learning for Hidden Attributes in Networks
author_facet Xiao Ran Yan
Jean-baptiste Rouquier
Yao Jia Zhu
Cristopher Moore
author_sort Xiao Ran Yan
title Active Learning for Hidden Attributes in Networks
title_short Active Learning for Hidden Attributes in Networks
title_full Active Learning for Hidden Attributes in Networks
title_fullStr Active Learning for Hidden Attributes in Networks
title_full_unstemmed Active Learning for Hidden Attributes in Networks
title_sort active learning for hidden attributes in networks
url http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.154.2081
http://www.cs.unm.edu/~treport/tr/10-01/paper-2010-02.pdf
geographic Weddell
Weddell Sea
geographic_facet Weddell
Weddell Sea
genre Weddell Sea
genre_facet Weddell Sea
op_source http://www.cs.unm.edu/~treport/tr/10-01/paper-2010-02.pdf
op_relation http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.154.2081
http://www.cs.unm.edu/~treport/tr/10-01/paper-2010-02.pdf
op_rights Metadata may be used without restrictions as long as the oai identifier remains attached to it.
_version_ 1766233635699032064