U4-9077 - SPIKE: Relations repositories does not use cache

Created by Lars-Erik Aabech 18 Oct 2016, 19:42:32 Updated by Lars-Erik Aabech 15 Nov 2016, 17:42:39

Tags: PR

Is duplicated by: U4-8942

Basically what the subject says.

There was a lot of text here, I removed it and PRed a new RepositoryFactory where the repos get the default CacheHelper. PR here https://github.com/umbraco/Umbraco-CMS/pull/1535

Spike: max 4 hours to look at this and discuss.

Comments

Lars-Erik Aabech 18 Oct 2016, 20:01:32

I dug around and found that the test doesn't use cache. Not sure how to go about that atm.

I'm sure it would be possible to do "oh, here's a list of DTOs, hydrate and cache all entities including these associations I left-joined now" instead of "oh, here's a list of DTOs, let me fetch each one once more, then each associated even tho I just did, and then possible cache them if all policies are good enough".

What's known in "enterprise level ORMs" as eager loading entities straight from the DB in one performant query...


Lars-Erik Aabech 18 Oct 2016, 20:26:46

So it basically boils down to this. :D

    public virtual IRelationRepository CreateRelationRepository(IDatabaseUnitOfWork uow)
    {
        return new RelationRepository(
            uow,
            _noCache, //never cache
            _logger, _sqlSyntax,
            CreateRelationTypeRepository(uow));
    }

    public virtual IRelationTypeRepository CreateRelationTypeRepository(IDatabaseUnitOfWork uow)
    {
        return new RelationTypeRepository(
            uow,
            _noCache, //never cache
            _logger, _sqlSyntax);
    }


Lars-Erik Aabech 19 Oct 2016, 18:16:57

Might give some perspective: https://github.com/lars-erik/umbraco-relation-editor/blob/master/Umbraco.RelationEditor/Extensions/PublishedContentExtensions.cs#L53


Umbraco 24 Oct 2016, 11:56:49

We'll have a look at if this is going to work in this sprint!


Lars-Erik Aabech 24 Oct 2016, 18:56:33

By the way, all the text was moderated and moved here:
https://our.umbraco.org/forum/contributing-to-umbraco-cms//80759-the-entire-repository-cache-mechanism
Still want to hear thoughts on the matter.


Stephan 29 Oct 2016, 14:44:16

@lars-erik just to clarify ... you wrote you'd PR a new repo factory, can you link to it? also, I see what you've done in your published content extensions... basically writing super-fast queries that query everything you need in one go, correct?

This second solutions seems the way to go for me. Caching is complex due to having to cleanup the cache, and since you're mentionning other ORMs, most of them have level-1 cache (you would reload relations on every request anyways) and require external tools for proper level-2 (shared amongst requests). In addition, there can be a large amount of relations, which we don't want to cache.

So... I'd rather say that relations come from the DB no matter what, but in an efficient way.

Thoughts?


Lars-Erik Aabech 30 Oct 2016, 18:56:10

@zpqrtbnk PR linked in description.

First of all, I did this as a reaction to a discussion on the umbracians slack where a helpful HQer recommended the RelationService as a performant and cached way to use relations. I argued it wasn't, and dug around for the proof. After swearing a lot not being able to "quickfix" an eager load, I ended up just caching them so the original tip would match the facts. I intended to do just that - "quickfix".

As you've noticed, I got upset by this previously and made a "hack" to eager load everything in the Relation Editor package. "Super fast, yet database query". :)

But to address this issue on its own, and on short term: You're saying relations wont be cached. The relation types at least should be cached. That way we cut the amount of queries in half. Whether it's possible to do a short term fix for the relations through the service, I don't know. The repositories have a kind of "closedness" that makes it harder to override the way a set is loaded without overriding too much expected inherited behavior. (caching for instance)

For the entire repo / query / cache discussion, I suggest we continue that in the our thread I created: https://our.umbraco.org/forum/contributing-to-umbraco-cms/80759-the-entire-repository-cache-mechanism Please ignore the named ORMs that surfaced. ;) The bottom line is that the current repository base class forces the n*x+1 query style for any content through the services, and they would surely benefit from the ability to use eager loading strategies. (At least not repeat query for each instance) Which would be the long term solution for this issue too.


Stephan 31 Oct 2016, 07:40:42

Was that PR link there yesterday? If so... must have been ''very'' tired :(

  • Caching relation types = definitively YES.
  • Caching relations = real caching would need to be MRU w/ capacity cap (hard), now need to look into your PR.
  • Having more open repos + less N+1 querying = all for it, prob v8 material.

As for ORMs... well, any ORM can be mis-used ;-) Umbraco is going NPoco with v8, the key being to use it correctly. Umbraco is not such a "big" app, db-wise, and we'd probably favor a small, efficient ORM that we can use correctly, over a large one that we spend time trying to configure. Similar to StackOverflow approach. So, no, no name discussion, only: can we use it better.


Lars-Erik Aabech 31 Oct 2016, 07:47:25

No, added PR yesterday. But I always make PRs with the same name as the issue number, so figured it wasn't needed. I could update the PR to just cache the types, but it's only a matter of resetting the one line to NoCache again after merge / close / throw away / whatever. ;)

We agree on the long term solution. Would be nice to discuss further. I'll dig up a blog post describing a good solution and add it to the our thread. (post hereby found and shared here: https://our.umbraco.org/forum/contributing-to-umbraco-cms/80759-the-entire-repository-cache-mechanism)


Stephan 09 Nov 2016, 10:53:51

Reviewing the PR. It's a one-liner really, just assigning a true cache helper to the relation type repository instead of a null cache (relations themselves are left unchanged). However... we need to ensure that the cache would be refreshed. In progress.


Lars-Erik Aabech 09 Nov 2016, 11:01:01

I assumed that was handled in the base repo class. If not, there's another thing to generalize. ;)


Stephan 09 Nov 2016, 12:20:13

Core PR: https://github.com/umbraco/Umbraco-CMS/pull/1614

merges the origin PR + changes to cache all relation types at once, and manage the distributed cache, refreshers and all.

review/test: should be possible to create, delete relation types. should sync/refresh the cache in a LB environment (ie delete on 1 server, refresh on another, should have been updated with the delete).

closing the original PR (but it's merged into this new one anyways)


Stephan 09 Nov 2016, 12:21:03

@lars-erik if you have a moment to test that'd be super cool


Lars-Erik Aabech 09 Nov 2016, 12:47:03

Looks good. With 4 relations and 2 types, I now have 6 queries on first load. Second load is down to 5 queries. Updating relation type invalidates, so back to 6 queries. Subsequent back to 5. So we're down to n+1 queries instead of n*2. :)

(didn't test LB, tho)


Stephan 09 Nov 2016, 13:10:59

Not good enough. Let me check.


Lars-Erik Aabech 09 Nov 2016, 13:13:50

As far as I can see it's right. It fetches all relations (1 query), then each of them again (4 queries). Only thing you might think wrong is the invalidated relation type that could've been cached on save.


Stephan 09 Nov 2016, 13:57:19

ah so the types cache is working correctly but we'd still have a N+1 when loading relations... looking into it...


Lars-Erik Aabech 09 Nov 2016, 14:00:29

Exactly. And that's what's hard to change unless you do a custom Relation repo. Which lead me to that other post on Our. ;)


Stephan 09 Nov 2016, 14:33:58

OK, have updated the PR, and the N+1 should be gone by now. Have a minute to test?


Lars-Erik Aabech 09 Nov 2016, 14:46:17

Sah-weet! :) 2 queries at boot, 1 every refresh. Full reload of types when 1 change, but that's fine.

Looked at the code. Not that much after all. I guess it helps knowing it as one's own pocket. Letting it brew in the mind's basement to see if some strategy pattern can be introduced.


Lars-Erik Aabech 09 Nov 2016, 14:46:29

Great job! #h5yr.


Stephan 09 Nov 2016, 15:44:37

To be reviewed & merged!


Lars-Erik Aabech 15 Nov 2016, 17:42:39

\o/


Priority: Normal

Type: Performance Problem

State: Fixed

Assignee:

Difficulty: Normal

Category:

Backwards Compatible: True

Fix Submitted: Pull request

Affected versions: 7.5.3

Due in version: 7.5.5

Sprint: Sprint 46

Story Points: 1

Cycle: