Author-topic model #893

Merged 103 commits, Jan 17, 2017

@olavurmortensen (Contributor) commented Sep 27, 2016

@tmylk @piskvorky

I'm implementing the author-topic (AT) model, which is based on "The Author-Topic Model for Authors and Documents" by Rosen-Zvi and co-authors. This project is in connection with my master's thesis at the Technical University of Denmark.

As indicated in the PR title, this is a work in progress.

The original paper presents an algorithm based on collapsed Gibbs sampling. I have derived a variational Bayes algorithm to train this model instead, and implemented that algorithm. Furthermore, I have made an online algorithm using the method described in "Online Learning For Latent Dirichlet Allocation" by Hoffman and co-authors.
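
As a rough illustration of the online scheme referenced above, the sketch below shows the step-size schedule and blended parameter update described in the Hoffman et al. paper, applied to a small topic-word matrix. This is not the PR's code: the function names and the `offset`/`decay` parameter names are chosen here for illustration, plain Python lists stand in for arrays, and the numbers are made up.

```python
def step_size(t, offset, decay):
    """Learning rate rho_t = (offset + t) ** (-decay) for update number t."""
    return (offset + t) ** (-decay)


def online_update(lam, lam_hat, rho):
    """Blend the current parameters with a mini-batch estimate.

    Computes (1 - rho) * lam + rho * lam_hat, element by element.
    """
    return [
        [(1.0 - rho) * old + rho * new for old, new in zip(row_old, row_new)]
        for row_old, row_new in zip(lam, lam_hat)
    ]


# Two topics over a three-word vocabulary (illustrative values only).
lam = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
lam_hat = [[3.0, 1.0, 1.0], [1.0, 1.0, 5.0]]

rho = step_size(t=1, offset=1.0, decay=1.0)  # (1 + 1) ** -1
lam = online_update(lam, lam_hat, rho)
```

Early mini-batches get a large weight and later ones a progressively smaller weight, so the parameters settle as training proceeds. In the paper the decay exponent is taken from the interval (0.5, 1].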

At the moment, the algorithm runs and seems to converge. I intend to run some experiments soon to see whether the resulting topics are good. I expect to find that I need to make some improvements, but we will see.

@olavurmortensen changed the title from "Author topic model [WIP]" to "Author-topic model [WIP]" on Sep 27, 2016
@tmylk changed the title from "Author-topic model [WIP]" to "[WIP] Author-topic model" on Sep 27, 2016
@tmylk added the "feature" and "difficulty hard" labels on Sep 27, 2016
…ent likelihood measure. OnlineAtVb now extends (inherits) LdaModel. Other minor changes.
… use of log_normalize in offline algo. Update notebook.
…speed up large experiments. Made it possible to initialize the model with LDA topics (lambda).
@@ -836,24 +834,28 @@ def bound(self, chunk, chunk_doc_idx=None, subsample_ratio=1.0, author2doc=None,
if not self.author2doc.get(a):
raise ValueError('bound cannot be called with authors not seen during training.')

chunk_doc_idx = xrange(len(chunk))
if chunk_doc_idx:
raise ValueError('Either author dictionaries or chunk_doc_idx must be prodivded, not both. Consult documentation of bound method.')
Contributor

*typo in provided

Contributor Author

Fixed.

@@ -0,0 +1,2097 @@
{
Contributor

Should this be a part of the codebase? It is a good regression performance test. Could you keep it as a gist elsewhere and keep a link to it in atmodel.py?

Contributor Author

I wasn't going to include it at all; it's just easiest for me to keep it there while developing. I have now removed it from the repo.

self.numworkers = 1
else:
# NOTE: distributed processing is not implemented for the author-topic model.
pass
Contributor

raise a not implemented exception to be explicit

Contributor Author

I don't know why I put that there; distributed=False is set explicitly (it is not an input). So now I just do the following:

distributed = False
self.dispatcher = None
self.numworkers = 1

Contributor Author

There is also a comment above that about implementing a distributed version (should be line 209).
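
For reference, the reviewer's suggestion could look like the sketch below in the hypothetical case where `distributed` were a constructor argument. In the PR it is not an input and is fixed to False, so this is only an illustration of the pattern; the class name `SketchModel` is invented here.

```python
class SketchModel:
    """Toy stand-in for a model class; not the PR's AuthorTopicModel."""

    def __init__(self, distributed=False):
        if distributed:
            # Fail loudly instead of silently ignoring the request.
            raise NotImplementedError(
                "distributed processing is not implemented for this model"
            )
        self.dispatcher = None
        self.numworkers = 1


model = SketchModel()

try:
    SketchModel(distributed=True)
except NotImplementedError as err:
    message = str(err)
```

Raising early makes the limitation visible to callers, whereas a bare `pass` would let a distributed request fall through unnoticed.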

if minimum_probability is None:
minimum_probability = self.minimum_probability

# NOTE: this is used in LdaModel:
Contributor

what does this comment mean? please expand

Contributor Author

Well. I was really sceptical about doing that. But now I just decided to do it anyway. So the note is removed and the line is uncommented (so it is doing what LdaModel is doing now).
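
The fallback shown in the diff above (a `None` argument falls back to the instance default) can be illustrated with a minimal sketch. It additionally assumes that topics below the threshold are dropped from the result; the class `TopicFilter` and the method `filter_topics` are hypothetical names, not gensim API.

```python
class TopicFilter:
    """Toy holder for a default probability threshold (hypothetical)."""

    def __init__(self, minimum_probability=0.01):
        self.minimum_probability = minimum_probability

    def filter_topics(self, topic_dist, minimum_probability=None):
        # Fall back to the instance-wide default when no value is passed.
        if minimum_probability is None:
            minimum_probability = self.minimum_probability
        # Keep (topic_id, probability) pairs at or above the threshold.
        return [
            (topic_id, prob)
            for topic_id, prob in enumerate(topic_dist)
            if prob >= minimum_probability
        ]


topic_filter = TopicFilter(minimum_probability=0.1)
dist = [0.65, 0.05, 0.3]

default_result = topic_filter.filter_topics(dist)
strict_result = topic_filter.filter_topics(dist, minimum_probability=0.5)
```

With the default threshold of 0.1 the second topic is dropped; with the stricter per-call threshold only the first topic remains.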

# Licensed under the GNU LGPL v2.1 - http://www.gnu.org/licenses/lgpl.html

"""
Automated tests for checking transformation algorithms (the models package).
Contributor

please update the docstring

Contributor Author

Done.

@tmylk (Contributor) left a comment

Fixmes in ipynb

# Perhaps test that the bound increases, in general (i.e. in several of the tests below where it makes
# sense).

# FIXME: remember to remove this, once done using it:
Contributor

please remove

Contributor Author

Done.

@@ -0,0 +1,1182 @@
{
"cells": [
Contributor

multiple fixmes in this ipynb need to be resolved before merging

Contributor Author

All fixmes removed.

…s well: removed do_mstep method (using LdaModels version directly), using minimum_probability in get_author_topics, removed statement (in log) that said perplexity is evaluated on held-out data.
@olavurmortensen
Contributor Author

Tutorial is done. @tmylk please review.

@tmylk (Contributor) left a comment

Ipynb minor changes

"* Pre-processing and training LDA: https://github.com/RaRe-Technologies/gensim/blob/develop/docs/notebooks/lda_training_tips.ipynb\n",
"\n",
"\n",
"> **NOTE:**\n",
Contributor

just give the pip commands

Contributor Author

Done.

"\n",
"As in the LDA tutorial, we will be performing qualitative analysis of the model, and at times this will require an understanding of the subject matter of the data. If you try running this tutorial on your own, consider applying it on a dataset with subject matter that you are familiar with. For example, try one of the [StackExchange datadump datasets](https://archive.org/details/stackexchange).\n",
"\n",
"You can download the data from Sam Roweis' website (http://www.cs.nyu.edu/~roweis/data.html).\n",
Contributor

just add a wget cell please

Contributor Author

Done.

"source": [
"#### Plotting the authors\n",
"\n",
"Now we're going to produce the kind of pacific archipelago looking plot below. The goal of this plot is to give you a way to explore the author-topic representation in an intuitive manner.\n",
Contributor

Please create a blog post on WordPress linking to this ipynb, with this blue graph as the title image.

[Notebook output cell: inline Bokeh JavaScript with the serialized plot JSON for the author plot (circle glyphs positioned by x and y with per-author radii; hover tooltips showing "author" and "size"; crosshair, pan, wheel zoom, box zoom, reset, save, and lasso select tools). The embedded list of author names is omitted here; the extract is truncated.]
Contributor

what should be the authors if tags are topics?

Contributor

feel free to link to a GIST with your attempt. it will help people

Contributor Author

I meant to write "You can treat the tags on the posts as authors" not "as topics". Fixed.
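
To make the corrected suggestion concrete, here is a small sketch of building an author-to-documents mapping in which post tags play the role of authors. The `author2doc` name follows the parameter seen in the `bound` signature earlier in this PR; the sample posts and the helper `build_author2doc` are made up for illustration.

```python
from collections import defaultdict


def build_author2doc(doc_tags):
    """Map each tag (treated as an author) to the indices of its posts."""
    author2doc = defaultdict(list)
    for doc_id, tags in enumerate(doc_tags):
        for tag in tags:
            author2doc[tag].append(doc_id)
    return dict(author2doc)


# One list of tags per post, in corpus order (illustrative data).
posts_tags = [["python", "numpy"], ["python"], ["latex"]]
author2doc = build_author2doc(posts_tags)
```

A post with several tags then counts as a document with several "authors", which is the situation the author-topic model is designed for.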

The model is closely related to Latent Dirichlet Allocation. The AuthorTopicModel class
inherits the LdaModel class, and its usage is thus similar.

Distributed computation and multiprocessing are not implemented at the moment, but may be
Contributor

please link to ipynb tutorial on nips from here and to the author-topic paper.

Contributor Author

Done.

# are included in the code where this is the case, for example in the log_perplexity
# and do_estep methods.

# FIXME: link to tutorial in docstring above, once the tutorial is available.
Contributor

remove fixme. it is resolved

Contributor Author

Yep, pushed that fix now.

@tmylk merged commit 739f34e into piskvorky:develop on Jan 17, 2017
@tmylk changed the title from "[WIP] Author-topic model" to "Author-topic model" on Feb 14, 2017
Labels: difficulty hard (Hard issue: required deep gensim understanding & high python/cython skills), feature (Issue described a new feature)

4 participants