Add subsection on multiple observation model comparison #27
Thanks Oriol, this is a very good start! Regarding your questions:
Finally, I think this would benefit from prior and posterior predictive checks, as well as posterior plots and trace plots. I'm very happy to contribute these parts, although I don't think I'll have time this week 😬 |
I will get to the watermark and import sorting eventually 😄; for now I prefer to focus on the content. I would like to eventually run it on PyMC3 3.9 and the latest ArviZ.
As I said above, I'm not sure this fits the goal of the section; there will be other sections on PPC, trace plots... We could look into making this a complete case study (and even extend the data to several years or several countries at the same time), but that would be a whole different story. I think this should be laser focused on IC plus multiple observations (this is why I am not even sure about model comparison: there already is a subsection on model comparison and I don't want to be too repetitive) |
I always like it when there are more details explaining a method rather than fewer, but I'm trying to find the "least long" explanation possible: the discussion on exchangeability is necessary, and if you think the mathematical description of the likelihood is needed too to bring the point home, then let's add it too. And I guess while writing this we will see whether the reparametrization examples should be added too -- to be clear, I think they are of value, but they take up a lot of space; it's a trade-off.
I see what you mean. A good idea could be to use this model and show how to improve it across various NBs of this repo, each NB being focused on one aspect of the iterative process. This would constitute a complete case-study, as you say, and I think this model is interesting because there are natural hierarchical structures (plus, it's not the Iris or Titanic datasets 😜 ).
The last two are confusing, but the first one is quite clear to me: what's the expected predictive accuracy of the models if we were trying to predict which team will score the next goal? But maybe I misunderstood what "leave one goal out" means 😆 |
All 3 are definitely confusing 😅. We are assessing the predictive accuracy of the model when predicting how many goals one of the teams will score in a match, without constraining the team to be home or away. In match A vs B we want to predict how many goals A will score and how many B will score, independently. It is similar to the match case, so it makes sense that the result is similar, but there is a very important difference: here our predictions are "easy" (scalar) but we have twice as many observations to predict, whereas in the match case our predictions are "hard" (we want to guess not one but 2 values) but we have fewer observations to predict. I don't know how to explain this better, which is why I think either the math or an implementation of what I mean will help. I also realized that we may want to keep this backend agnostic? @aloctavodia @canyon289 I guess that would tip the decision towards loading the inference data from netCDF and sticking to the math |
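A minimal sketch of the difference between the two predictive tasks, assuming a Poisson likelihood for each team's goals and home and away goals that are conditionally independent given the parameters. The rates, observed goals and variable names below are made up for illustration and are not taken from the notebook. In the "match" case each observation is a pair of goal counts, so the pointwise log likelihoods are added; in the "goals" case each team's score is its own observation, so they are concatenated.

```python
from math import lgamma

import numpy as np

rng = np.random.default_rng(0)
n_draws, n_matches = 500, 6

# Hypothetical posterior draws of the scoring rates (illustrative values only)
home_rate = rng.gamma(shape=8.0, scale=0.2, size=(n_draws, n_matches))
away_rate = rng.gamma(shape=6.0, scale=0.2, size=(n_draws, n_matches))

# Hypothetical observed goals for the same 6 matches
home_goals = np.array([2, 1, 0, 3, 1, 2])
away_goals = np.array([1, 1, 2, 0, 0, 1])


def poisson_logpmf(k, mu):
    """Poisson log pmf, broadcasting k (matches,) against mu (draws, matches)."""
    log_factorial = np.array([lgamma(x + 1) for x in k])
    return k * np.log(mu) - mu - log_factorial


ll_home = poisson_logpmf(home_goals, home_rate)  # shape (n_draws, n_matches)
ll_away = poisson_logpmf(away_goals, away_rate)  # shape (n_draws, n_matches)

# "Match" case: one observation per match, a pair of goal counts
ll_match = ll_home + ll_away  # shape (n_draws, n_matches)

# "Goals" case: each team's score is a separate scalar observation
ll_goals = np.concatenate([ll_home, ll_away], axis=1)  # shape (n_draws, 2 * n_matches)

print(ll_match.shape, ll_goals.shape)
```

Both arrays carry the same total log likelihood per draw; what changes is how it is split into pointwise terms, and therefore what "leaving one observation out" means.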
Quick comment: Will deeply review by EOD Saturday to provide better
feedback.
…On Wed, Jun 10, 2020 at 10:38 AM Oriol Abril-Pla ***@***.***> wrote:
goal/observation/"half match" could all be used but I think that all of
them can still be confusing
m_goals here
<https://nbviewer.jupyter.org/github/OriolAbril/calaix_de_sastre/blob/master/premier-hierarchical-model/premier-analytics.ipynb>
is the model corresponding to this confusing case
|
Thanks Oriol, this makes complete sense 👌 |
To your question about splitting this out.
I would agree that explaining the IC calculation separately from the comparison makes sense. As far as the code goes, it's a great example of how to do model checks with multiple likelihoods. As far as soccer/football goes, I don't know what a half match is, so I lack the domain knowledge to know what's going on. I like the direction this is going and am interested to see the text that explains what is going on! |
I have extended the content with some math explanation and alternative implementations, so now there is actually content to review. I am still not sure about what to include and what to exclude, but most of the content was already written; I basically had to gather the pieces and put them together. I think it will be clear now. Most of the work should go into making this both clear and concise; it is still a little chaotic. |
Thanks Oriol, just skimmed through it and it looks awesome! Here is a first batch of comments. |
View / edit / reply to this conversation on ReviewNB AlexAndorra commented on 2020-06-12T10:00:17Z I wonder if "multiple-likelihoods models" isn't less confusing than "multi-observation models". I fear the latter could be interpreted as "you have multiple data points", not as "you have multiple likelihood distributions" OriolAbril commented on 2020-06-12T14:08:42Z Good point, I'll change this. I will probably agree with any proposal to reduce the number of times observation is used, I think it is ambiguous in this context. |
View / edit / reply to this conversation on ReviewNB AlexAndorra commented on 2020-06-12T10:00:18Z
OriolAbril commented on 2020-06-12T14:30:17Z In the league there are 20 teams, which means that there are 38 match days (each team plays every other team twice) and there are 10 matches every match day.
AlexAndorra commented on 2020-06-13T14:15:12Z Ok, much clearer, thanks! |
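The arithmetic behind those numbers can be checked with a short sketch, assuming a double round-robin in which every team plays on every match day. The team names are placeholders.

```python
from itertools import permutations

n_teams = 20
teams = [f"team_{i}" for i in range(n_teams)]  # placeholder team names

# Every ordered (home, away) pair of distinct teams is one match,
# so each pair of teams meets twice (once at each ground)
matches = list(permutations(teams, 2))

# With all teams playing on every match day, there are n_teams / 2 matches per day
matches_per_day = n_teams // 2
match_days = len(matches) // matches_per_day

print(len(matches), match_days, matches_per_day)  # prints: 380 38 10
```

So the full season has 380 matches, which also means 380 "match" observations but 760 individual goal counts.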
View / edit / reply to this conversation on ReviewNB AlexAndorra commented on 2020-06-12T10:00:18Z
OriolAbril commented on 2020-06-12T14:31:46Z I'd refer readers interested in that to the source of the model (I have in mind to add the sources on top like in the pymc3 example notebook). AlexAndorra commented on 2020-06-13T14:15:49Z Ow, there is a source for this model?? I'm interested! OriolAbril commented on 2020-06-13T17:12:39Z I used the code in the pymc3 rugby example, which in turn is based on the premier example http://danielweitzenfeld.github.io/passtheroc/blog/2014/10/28/bayes-premier-league/
|
I still have not had the time to review this, but here is a general comment about this repository motivated by question 1. I think each notebook should focus on a single topic (and we can go as granular as we want); if we want to show a more complete "Bayesian workflow" we should have a dedicated notebook (or notebooks) to do that. And we should try to discuss as much theory as possible (always keeping in mind the theoretical elements that are useful for the applications). As not every potential user of this repo will be interested in "going too deep" on the theoretical side, we may have an "in depth section" per notebook. In fact, a few of the already available notebooks have one. I am not saying we must follow this pattern in every notebook, but we can use it if necessary or desired. Another (not mutually exclusive) option is to have a few notebooks that are more theoretical and others that are more practical. |
Thanks @OriolAbril, this is really nice now! The explanations are very clear and the examples easy to follow 👏 |
View / edit / reply to this conversation on ReviewNB AlexAndorra commented on 2020-06-13T15:34:04Z
|
View / edit / reply to this conversation on ReviewNB AlexAndorra commented on 2020-06-13T15:34:05Z I love making ArviZ fail on purpose, very pedagogical. Here is what I'd write just after that cell, to expand on what you already wrote: "The error message is quite clear: ArviZ doesn't know what to do with the several likelihoods it found. This is because this information is neither in the model nor in the data, as we said above: we need to tell ArviZ what we're interested in. In other words, we are the boss here, and the model needs us more than we need it!" OriolAbril commented on 2020-06-13T17:00:53Z I would say one goal of the notebook is understanding why ArviZ fails with multiple likelihoods, however the error may be read by users who have not read the notebook, it is in this case that I am not sure the error provides any useful info. AlexAndorra commented on 2020-06-14T14:41:43Z Ow ok, outside of this NB context you mean. Maybe we can add a link to this NB in the error message, once the NB is merged?
|
View / edit / reply to this conversation on ReviewNB AlexAndorra commented on 2020-06-13T15:34:06Z I think we should show how to do it in the order of the examples you cited above. So here it should deal with away goals instead of home goals. Or we can just change the bullet point above. OriolAbril commented on 2020-06-13T17:02:43Z yeah, I'll change that. I had both cases, but the only difference is changing the home by away, so I figured it was not worth it and I don't know why I erased the away one 🤷 |
View / edit / reply to this conversation on ReviewNB AlexAndorra commented on 2020-06-13T15:34:07Z I'd modify the formulation a bit, to relate it more to what we just did. Something like: "Actually, what we just did corresponds to another, specific implementation of our base model. But this time, we'll target our model to the specific task of predicting the goals scored by away teams. Notice how we do not throw away..." OriolAbril commented on 2020-06-13T17:10:34Z I am still not sure about adding the alternative implementations, I would prefer to keep the notebook backend agnostic (I think it will be possible once the updated rugby data is available in ArviZ). I already had them so I decided to include them in case they could help any of you clarify some doubts.
Moreover, I think it will be better to use some diagrams to highlight the observations we are interested in. I'll add them whenever I have time. AlexAndorra commented on 2020-06-14T14:44:52Z Diagrams are a great idea! And it's true that if we add diagrams, then the other implementations of the model are less necessary (although interesting to keep somewhere). That'd be great if we could hide/show cells in Jupyter NBs (like with the new |
View / edit / reply to this conversation on ReviewNB AlexAndorra commented on 2020-06-13T15:34:08Z
OriolAbril commented on 2020-06-13T17:15:20Z I agree if we eventually add the code it should probably have some explanation or link to a description of pm.Potential, as I said above though, I am not sure about this being inside the scope of the notebook. |
View / edit / reply to this conversation on ReviewNB AlexAndorra commented on 2020-06-13T15:34:08Z
OriolAbril commented on 2020-06-13T17:29:15Z I think the second bullet point is a little confusing/imprecise: what we are doing here is assessing the predictive accuracy of predicting the outcome of the whole match. If our ultimate goal is the one above (going to matches that have the most probability of ending 3-3 or 4-4) this strategy would be the one to use in order to get the predictive accuracy of the desired "observation" (if we had several models, we would compare them based on these loo values instead of away team predictive accuracy). But there are other cases where the interest lies in the whole outcome of the match, the 3-3 case is only one example. Other silly examples could be: wanting to go to the dullest match because you don't care much about the match and want to be able to talk during it (maybe you are bringing your significant other); having some kind of fetish for matches that end up 4-1; predicting the whole outcome of matches to bet on them...
In the same betting example, you may realize there is more money in guessing the goals of the away team than the goals of the home team and therefore you'd compare models with the away goals metric (like the supporters with low budget). I don't want to use betting examples though. AlexAndorra commented on 2020-06-14T14:52:29Z To be clear, I had already understood what you made explicit above from the material already present in the NB. I like your point about changing the precise example, instead of taking the same as in the bullet point. That way, people will understand that 3-3 or 4-4 is just an example. Something like: "Here, the utility we're trying to maximize, as slightly nerdy football fans, is the pairs of goals -- matches that best fit our football tastes could be those that end up 4-4, or those with the lowest number of goals because we don't care much about the game and want to be able to talk during it, or those that end up with a precise score, like 6-2 (because we also love tennis)." |
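As a rough illustration of how the choice of "observation" changes the score used to compare models, here is a sketch that computes an in-sample log pointwise predictive density (lppd) for the away goals and for the whole match outcome. This is a simplification and only an assumption for illustration: it is not the PSIS-LOO estimate discussed in this thread, and the log likelihood values are random placeholders rather than output from the notebook's model.

```python
import numpy as np


def lppd(log_lik):
    """In-sample log pointwise predictive density.

    log_lik has shape (draws, observations); the result is the sum over
    observations of the log of the mean likelihood across draws.
    """
    n_draws = log_lik.shape[0]
    max_ll = log_lik.max(axis=0)
    # Numerically stable log-mean-exp over the draws dimension
    pointwise = max_ll + np.log(np.exp(log_lik - max_ll).sum(axis=0)) - np.log(n_draws)
    return pointwise.sum()


rng = np.random.default_rng(1)
n_draws, n_matches = 400, 8

# Placeholder pointwise log likelihoods (illustrative values only)
ll_home = rng.normal(-1.5, 0.3, size=(n_draws, n_matches))
ll_away = rng.normal(-1.3, 0.3, size=(n_draws, n_matches))

# Interest in away goals only: score the away likelihood alone
score_away = lppd(ll_away)

# Interest in the whole match outcome: one observation per match,
# so the home and away log likelihoods are added before averaging
score_match = lppd(ll_home + ll_away)

print(score_away, score_match)
```

The two scores answer different questions, so when several models are compared they should all be scored with the same definition of observation.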
View / edit / reply to this conversation on ReviewNB AlexAndorra commented on 2020-06-13T15:34:09Z "As in our first example, this predictive task corresponds to a specific model that we could have written as follows in the first place:" |
View / edit / reply to this conversation on ReviewNB AlexAndorra commented on 2020-06-13T15:34:10Z
|
View / edit / reply to this conversation on ReviewNB canyon289 commented on 2020-06-14T14:29:36Z Reading through all the text you've added here and so far so good! Readable and understandable! |
Here is a preview of the kind of diagrams I had in mind. I would include only the picture and not the code to generate it (I would then probably make a blog post with the code and diagram generation). (Note the diagrams are for the rugby dataset, not Premier League.) |
Good idea, that looks nice! And indeed, releasing the code in a subsequent blog post seems appropriate -- showing how to display this figure isn't the core of this NB. We'll have to explain how to interpret the diagram though. |
add diagrams; remove alternative PyMC3 implementation; add leave one team out draft description
I think I got the right skeleton now, only big structural change could be using |
@OriolAbril If you are OK with it, I would like to merge this, and then work on adapting it to PyMC 5 and integrating it with the rest of the chapters. |
Sounds good! I also want to generate an alternative model with a group level variable as covariate (e.g. the annual budget of the team) so we can actually do model comparison. The trickiest part might be having to run all 4 (or more) models every time we want to build the website though |
I have not used it yet, but Quarto has a freeze feature https://quarto.org/docs/projects/code-execution.html#freeze |
As discussed in #14, this PR adds a new subsection to the model comparison section. It is based on the rugby model but applied to Premier League data. I still have some doubts about the best way to organize the notebook and what to add to make it clear and accessible. Here are some of them, but feel free to add extra suggestions: