Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove recursive logic for object relation attributes metaData function #55

Merged
merged 3 commits into from
Sep 6, 2017

Conversation

pkamps
Copy link
Member

@pkamps pkamps commented Apr 10, 2017

Each datatype has the function metaData(). It's used by the search engine in order to get attribute content you want to put in to the search index. For ezstring is it just the string that is stored in the attribute.

It's more complex for an object relation attribute (ezobjectrelation and ezobjectrelationlist). In that case, the metaData function is looking up the related object and collects all searchable data on the related content object.

So if you have an article with an object relation attribute to configure related articles, the metaData for that attribute will collect all data from the related article.
Now let's assume the related article also has an object relation attribute to an image object. Again, the metaData function will lookup the related object and collect all data from the image data map.
That recursive logic will follow all object relations and puts all collected data into the search index for one single attribute.

This pull request is removing the recursive logic. For an article object relation attribute, it will only collect all data of the related article. Object relations on the related article are ignored (like the image object in the previous example).

With this pull request, we will only index closely related data. It also avoids a problem the upstream version currently has: ezsystems#1288

Technical details:
I moved the function 'metaDataArray' from ezcontentobjectattribute to ezobjectrelationlisttype. That function is very specific to the ezobjectrelationlisttype and ezobjectrelationtype datatype. Only those 2 classes are calling the function. That's why I believe it does not belong to ezcontentobjectattribute.

I removed the recursion protection from ezobjectrelationlisttype and ezobjectrelationtype datatypes (function metaData). It's not needed when there is no recursion.

@peterkeung
Copy link
Member

I support the general idea of stopping the indexing from going too deep. We've had to create custom indexing classes to overcome this behavior, which was affecting search relevance in a bad way. Needing to "reach deeper" is not the default use case.

I think this also helps in the edge cases where you want to have the same object related multiple times in the same attribute -- but that would be for Mugo Object Relations: https://app.assembla.com/spaces/mugo-development-general/tickets/433

  • Is there any precedent for a datatype to rely on another datatype? It's logically odd to me that eZObjectRelation now calls in code from eZObjectRelationList

@benkmugo
Copy link
Member

I agree with you both here.
Having closely related data present in the index for the indexed item is important and we make use of that on a lot of our sites, especially the book/ONIX related ones.

I cannot however see a good use case where data more than 1 relation 'deep' should also be present by default.

That gets at what Peter mentioned in that a use case, where that is required for whatever reason, should have to enable this kind of behaviour (however that may be).

eZFind is fast, but keeping the index from bloating by restricting the related data indexed would be a good step to stop this from becoming a problem in the first place.

@pkamps
Copy link
Member Author

pkamps commented Apr 11, 2017

About where to put metaDataArray function

I agree with Peter, it's not great to have the dependency for a datatype to another datatype. But the source code bundle comes with all datatypes always - so it's not very risky to have the dependency.

What are my alternative options:

  1. Put the code back into ezcontentobjectattribute.
  2. Duplicate the code and have the metaDataArray() in both datatypes
  3. Add a new parent class for both ezobjectrelationlisttype and ezobjectrelationtype datatype. That parent class extends eZDataType and contains common code for both datatypes.

My evaluation about the alternative options:

  1. That function really does not belong to ezcontentobjectattribute. Especially with my changes from this pull request, the function is really tailored to the object relation datatypes.
  2. A valid option. Just a bit harder to maintain.
  3. Probably the cleanest option. I imagine there is quite a few code sections valid for both datatypes. A common parent class would be the best spot to put it. But it's quite a bit change in the class structure and I'm not sure where to put that parent class (maybe in the same folder as the eZDataType file is)

I think long term goal should be to have a single attribute for object relations. And allow to set a limit for the object relations. Then there is no need to have 2 separate datatypes. With this background, I would consider option 2. or the current implementation.

@benkmugo
Copy link
Member

I'd vote for 3. from your list of options, but I don't know how much effort or code changes that would require (risk of breaking stuff).
So perhaps 2. is best short term option, despite duplicate code.

@peterkeung
Copy link
Member

Yes, I prefer option 2

@pkamps
Copy link
Member Author

pkamps commented Apr 12, 2017

The metaDataArray function is now on both datatypes which drops the dependency between the datatypes.

@pkamps
Copy link
Member Author

pkamps commented May 11, 2017

Needs testing

@ernestob
Copy link
Member

ernestob commented Sep 5, 2017

This is working fine.

Before applying this patch, the results included the content from other related content and their relations, eg.:

Found object content -> Related object content -> Related object content ...

After applying this patch
The metadata content only looks relations one level depth, eg.:

Found object -> Related object content

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants