Author-topic model #893
…ent likelihood measure. OnlineAtVb now extends (inherits) LdaModel. Other minor changes.
…o eval_liklihood.
…rparameters. Updated notebook accordingly.
…pts it. Still work to be done in that area.
…thm works. Updated notebook.
…appened to the offline lately.
…moved author_prior_prob from mu update.
…malize. Various other changes.
…nly occasionally. Updated notebook.
… use of log_normalize in offline algo. Update notebook.
…speed up large experiments. Made it possible to initialize the model with LDA topics (lambda).
…s. Also fixed the 'endclass' comment.
…even some mistakes. Cleaned up the code (atmodel.py and tests) w.r.t. PEP8 (disregarding E501, E731, E12 and W503) and removing vertical indent.
@@ -836,24 +834,28 @@ def bound(self, chunk, chunk_doc_idx=None, subsample_ratio=1.0, author2doc=None, | |||
if not self.author2doc.get(a): | |||
raise ValueError('bound cannot be called with authors not seen during training.') | |||
|
|||
chunk_doc_idx = xrange(len(chunk)) | |||
if chunk_doc_idx: | |||
raise ValueError('Either author dictionaries or chunk_doc_idx must be prodivded, not both. Consult documentation of bound method.') |
*typo in provided
Fixed.
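The check under review enforces that `bound` receives either author dictionaries or `chunk_doc_idx`, never both. A minimal standalone sketch of that kind of validation follows. The helper name `resolve_doc_indices` is hypothetical and not part of the PR, and the second error (for a missing `chunk_doc_idx`) is an assumption about sensible behaviour rather than something shown in the diff.

```python
def resolve_doc_indices(chunk, chunk_doc_idx=None, author2doc=None, doc2author=None):
    """Hypothetical helper: decide which document indices apply to `chunk`.

    Rule discussed in the review: pass either author dictionaries
    or chunk_doc_idx, not both.
    """
    has_author_dicts = author2doc is not None or doc2author is not None

    if has_author_dicts and chunk_doc_idx is not None:
        raise ValueError(
            'Either author dictionaries or chunk_doc_idx must be provided, not both.'
        )

    if has_author_dicts:
        # The supplied dictionaries describe this chunk directly,
        # so its documents are simply numbered from zero.
        return list(range(len(chunk)))

    if chunk_doc_idx is None:
        # Assumption: without author dictionaries the caller must say
        # which training documents the chunk corresponds to.
        raise ValueError('chunk_doc_idx is required when no author dictionaries are given.')

    return list(chunk_doc_idx)
```

Raising early with a message that names both arguments keeps the failure close to the caller's mistake instead of surfacing later as an indexing error.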
@@ -0,0 +1,2097 @@ | |||
{ |
Should this be part of the codebase? It is a good regression performance test. Could you keep it as a gist elsewhere and link to it from atmodel.py?
I wasn't going to include it at all, it's just easiest for me to keep it there while developing. I have now removed it from the repo.
self.numworkers = 1 | ||
else: | ||
# NOTE: distributed processing is not implemented for the author-topic model. | ||
pass |
Raise a NotImplementedError to be explicit.
I don't know why I put that there; distributed=False is set explicitly (it is not an input). So now I just do the following:
distributed = False
self.dispatcher = None
self.numworkers = 1
There is also a comment above that about implementing a distributed version (should be line 209).
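For comparison, the reviewer's suggestion of failing explicitly could look like the sketch below. It applies only if `distributed` were a constructor argument, which the author says it is not in this PR. The class `SingleProcessModel` is a hypothetical stand-in, not the PR's model class.

```python
class SingleProcessModel:
    """Hypothetical stand-in for a model without distributed support."""

    def __init__(self, distributed=False):
        if distributed:
            # Fail loudly rather than silently falling back to one worker.
            raise NotImplementedError(
                'distributed processing is not implemented for this model'
            )
        # Same single-process defaults the author describes above.
        self.dispatcher = None
        self.numworkers = 1
```

When the flag is not exposed at all, as in the author's fix, hardcoding the single-process attributes is the simpler choice and no exception is needed.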
if minimum_probability is None: | ||
minimum_probability = self.minimum_probability | ||
|
||
# NOTE: this is used in LdaModel: |
What does this comment mean? Please expand.
Well. I was really sceptical about doing that. But now I just decided to do it anyway. So the note is removed and the line is uncommented (so it is doing what LdaModel is doing now).
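The diff context above falls back to the model-level `minimum_probability` when the caller passes `None`. A minimal sketch of that fallback combined with threshold filtering is shown below. The function `filter_topics` and its `default_minimum` parameter are hypothetical, and the sketch does not reproduce the LdaModel line the note referred to, since the thread does not show it.

```python
def filter_topics(topic_dist, minimum_probability=None, default_minimum=0.01):
    """Hypothetical helper: keep (topic_id, probability) pairs above a threshold.

    `topic_dist` is a sequence of probabilities indexed by topic id.
    """
    if minimum_probability is None:
        # Fall back to the model-wide default, as in the diff context above.
        minimum_probability = default_minimum

    return [
        (topic_id, prob)
        for topic_id, prob in enumerate(topic_dist)
        if prob >= minimum_probability
    ]
```

For example, `filter_topics([0.5, 0.005, 0.495])` keeps topics 0 and 2 and drops topic 1, which falls below the assumed default threshold of 0.01.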
# Licensed under the GNU LGPL v2.1 - http://www.gnu.org/licenses/lgpl.html | ||
|
||
""" | ||
Automated tests for checking transformation algorithms (the models package). |
please update the docstring
Done.
Fixmes in ipynb
# Perhaps test that the bound increases, in general (i.e. in several of the tests below where it makes | ||
# sense. | ||
|
||
# FIXME: remember to remove this, once done using it: |
please remove
Done.
@@ -0,0 +1,1182 @@ | |||
{ | |||
"cells": [ |
multiple fixmes in this ipynb need to be resolved before merging
All fixmes removed.
…s well: removed do_mstep method (using LdaModels version directly), using minimum_probability in get_author_topics, removed statement (in log) that said perplexity is evaluated on held-out data.
Tutorial is done. @tmylk please review.
Ipynb minor changes
"* Pre-processing and training LDA: https://github.com/RaRe-Technologies/gensim/blob/develop/docs/notebooks/lda_training_tips.ipynb\n", | ||
"\n", | ||
"\n", | ||
"> **NOTE:**\n", |
just give the pip commands
Done.
"\n", | ||
"As in the LDA tutorial, we will be performing qualitative analysis of the model, and at times this will require an understanding of the subject matter of the data. If you try running this tutorial on your own, consider applying it on a dataset with subject matter that you are familiar with. For example, try one of the [StackExchange datadump datasets](https://archive.org/details/stackexchange).\n", | ||
"\n", | ||
"You can download the data from Sam Roweis' website (http://www.cs.nyu.edu/~roweis/data.html).\n", |
just add a wget cell please
Done.
"source": [ | ||
"#### Plotting the authors\n", | ||
"\n", | ||
"Now we're going to produce the kind of pacific archipelago looking plot below. The goal of this plot is to give you a way to explore the author-topic representation in an intuitive manner.\n", |
Please create a blog post on WordPress linking to this ipynb, with this blue graph as the title image.
" var inline_js = [\n", | ||
" function(Bokeh) {\n", | ||
" Bokeh.$(function() {\n", | ||
" var docs_json = … [embedded Bokeh plot JSON truncated: plot, axis, grid and tool definitions, followed by the data columns author_sizes, x, author_names, radii and y] |
what should be the authors if tags are topics?
Feel free to link to a gist with your attempt; it will help people.
I meant to write "You can treat the tags on the posts as authors" not "as topics". Fixed.
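Treating tags as authors amounts to building an author-to-documents mapping in which each tag plays the role of an author. A minimal sketch, assuming each post is represented only by its list of tags; the function name `build_author2doc` and the sample tags are hypothetical:

```python
from collections import defaultdict


def build_author2doc(doc_tags):
    """Map each tag (acting as an "author") to the ids of the posts carrying it.

    `doc_tags` is a list where entry i holds the tags of post i.
    """
    author2doc = defaultdict(list)
    for doc_id, tags in enumerate(doc_tags):
        # Deduplicate so a repeated tag does not list the post twice.
        for tag in set(tags):
            author2doc[tag].append(doc_id)
    return dict(author2doc)


# Three posts: two tagged "python", one of them also tagged "numpy".
posts = [['python', 'numpy'], ['python'], []]
author2doc = build_author2doc(posts)
```

Posts without tags simply do not appear under any author, so they would need to be dropped or given a placeholder tag before training.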
The model is closely related to Latent Dirichlet Allocation. The AuthorTopicModel class | ||
inherits the LdaModel class, and its usage is thus similar. | ||
|
||
Distributed compuation and multiprocessing is not implemented at the moment, but may be |
please link to ipynb tutorial on nips from here and to the author-topic paper.
Done.
# are included in the code where this is the case, for example in the log_perplexity | ||
# and do_estep methods. | ||
|
||
# FIXME: link to tutorial in docstring above, once the tutorial is available. |
remove fixme. it is resolved
Yep, pushed that fix now.
@tmylk @piskvorky
I'm implementing the author-topic (AT) model, which is based on "The Author-Topic Model for Authors and Documents" by Rosen-Zvi and co-authors. This project is part of my master's thesis at the Technical University of Denmark.
As indicated in the PR title, this is a work in progress.
The original paper presents an algorithm based on collapsed Gibbs sampling. I have instead derived a variational Bayes algorithm to train this model, and implemented that algorithm. Furthermore, I have made an online algorithm using the method described in "Online Learning for Latent Dirichlet Allocation" by Hoffman and co-authors.
At the moment, the algorithm runs and seems to converge. I intend to run some experiments soon to see whether the resulting topics are good. I expect to find that I need to make some improvements, but we will see.
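For readers unfamiliar with the online method of Hoffman and co-authors, its core is a decaying step size and a weighted blend of the current topic-word parameters with an estimate computed from the latest mini-batch. The sketch below illustrates only that generic step in plain Python; the function names are hypothetical and it is not the code in this PR.

```python
def learning_rate(t, tau0=1.0, kappa=0.7):
    """Step size rho_t = (tau0 + t) ** (-kappa) for update number t.

    tau0 >= 0 down-weights early updates; the paper takes kappa in (0.5, 1].
    """
    return (tau0 + t) ** (-kappa)


def online_update(lam, lam_hat, rho):
    """Blend current parameters with the mini-batch estimate.

    Computes (1 - rho) * lam + rho * lam_hat element-wise, where both
    arguments are lists of rows (one row per topic).
    """
    return [
        [(1.0 - rho) * old + rho * new for old, new in zip(row_old, row_new)]
        for row_old, row_new in zip(lam, lam_hat)
    ]


# One topic over two words, updated with a step size of 0.5.
lam = online_update([[1.0, 2.0]], [[3.0, 4.0]], rho=0.5)
```

Because the step size shrinks with each update, later mini-batches move the parameters less, which is what lets the estimate settle as more data is seen.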