Active Learning for Hidden Attributes in Networks
In many networks, vertices have hidden attributes that are correlated with the network’s topology. For instance, in social networks, people are more likely to be friends if they are demographically similar. In food webs, predators typically eat prey of lower body mass. We explore a setting in which...
Main Authors: | , , , |
---|---|
Other Authors: | |
Format: | Text |
Language: | English |
Subjects: | |
Online Access: | http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.154.2081 http://www.cs.unm.edu/~treport/tr/10-01/paper-2010-02.pdf |
id |
ftciteseerx:oai:CiteSeerX.psu:10.1.1.154.2081 |
---|---|
record_format |
openpolar |
spelling |
ftciteseerx:oai:CiteSeerX.psu:10.1.1.154.2081 2023-05-15T18:43:18+02:00 Active Learning for Hidden Attributes in Networks Xiao Ran Yan Jean-baptiste Rouquier Yao Jia Zhu Cristopher Moore The Pennsylvania State University CiteSeerX Archives application/pdf http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.154.2081 http://www.cs.unm.edu/~treport/tr/10-01/paper-2010-02.pdf en eng http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.154.2081 http://www.cs.unm.edu/~treport/tr/10-01/paper-2010-02.pdf Metadata may be used without restrictions as long as the oai identifier remains attached to it. http://www.cs.unm.edu/~treport/tr/10-01/paper-2010-02.pdf text ftciteseerx 2016-01-07T15:27:07Z In many networks, vertices have hidden attributes that are correlated with the network’s topology. For instance, in social networks, people are more likely to be friends if they are demographically similar. In food webs, predators typically eat prey of lower body mass. We explore a setting in which the network’s topology is known, but these attributes are not. If each vertex can be queried, learning the value of its hidden attributes— but only at some cost—then we need an algorithm which chooses which vertex to query next, in order to learn as much as possible about the attributes of the remaining vertices. We assume that the network is generated by a probabilistic model, but we make no assumptions about the assortativity or disassortativity of the network. We then query the vertex with the largest mutual information between its type and that of the others (a well-known approach in active learning) or with the largest average agreement between two independent samples of the Gibbs distribution which agree on its type. We test these approaches on two networks with known attributes, the Karate Club network and a food web of species in the Weddell Sea. In several cases, we found that the average agreement algorithm performs better than mutual information. The algorithms appear to explore the network intelligently, first querying vertices at the centers of communities, and then querying vertices at the boundaries between communities. 1 Text Weddell Sea Unknown Weddell Weddell Sea |
institution |
Open Polar |
collection |
Unknown |
op_collection_id |
ftciteseerx |
language |
English |
description |
In many networks, vertices have hidden attributes that are correlated with the network’s topology. For instance, in social networks, people are more likely to be friends if they are demographically similar. In food webs, predators typically eat prey of lower body mass. We explore a setting in which the network’s topology is known, but these attributes are not. If each vertex can be queried, learning the value of its hidden attributes— but only at some cost—then we need an algorithm which chooses which vertex to query next, in order to learn as much as possible about the attributes of the remaining vertices. We assume that the network is generated by a probabilistic model, but we make no assumptions about the assortativity or disassortativity of the network. We then query the vertex with the largest mutual information between its type and that of the others (a well-known approach in active learning) or with the largest average agreement between two independent samples of the Gibbs distribution which agree on its type. We test these approaches on two networks with known attributes, the Karate Club network and a food web of species in the Weddell Sea. In several cases, we found that the average agreement algorithm performs better than mutual information. The algorithms appear to explore the network intelligently, first querying vertices at the centers of communities, and then querying vertices at the boundaries between communities. 1 |
author2 |
The Pennsylvania State University CiteSeerX Archives |
format |
Text |
author |
Xiao Ran Yan Jean-baptiste Rouquier Yao Jia Zhu Cristopher Moore |
spellingShingle |
Xiao Ran Yan Jean-baptiste Rouquier Yao Jia Zhu Cristopher Moore Active Learning for Hidden Attributes in Networks |
author_facet |
Xiao Ran Yan Jean-baptiste Rouquier Yao Jia Zhu Cristopher Moore |
author_sort |
Xiao Ran Yan |
title |
Active Learning for Hidden Attributes in Networks |
title_short |
Active Learning for Hidden Attributes in Networks |
title_full |
Active Learning for Hidden Attributes in Networks |
title_fullStr |
Active Learning for Hidden Attributes in Networks |
title_full_unstemmed |
Active Learning for Hidden Attributes in Networks |
title_sort |
active learning for hidden attributes in networks |
url |
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.154.2081 http://www.cs.unm.edu/~treport/tr/10-01/paper-2010-02.pdf |
geographic |
Weddell Weddell Sea |
geographic_facet |
Weddell Weddell Sea |
genre |
Weddell Sea |
genre_facet |
Weddell Sea |
op_source |
http://www.cs.unm.edu/~treport/tr/10-01/paper-2010-02.pdf |
op_relation |
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.154.2081 http://www.cs.unm.edu/~treport/tr/10-01/paper-2010-02.pdf |
op_rights |
Metadata may be used without restrictions as long as the oai identifier remains attached to it. |
_version_ |
1766233635699032064 |