How does Hugo maintain site-wide data, such as .Site.AllPages?

I'm looking for some bite-sized examples of how Hugo might be managing site-wide data, like .Site.AllPages.

Specifically, Hugo seems too fast to be reading in every file and its metadata before beginning to generate pages and making things like .Site.AllPages available -- but obviously that has to be the case.

Are Ruby (Jekyll) and Python (Pelican) really just that slow, or is there some specific (algorithmic) method that Hugo employs to generate pages before everything is ready?

There is no magic, and Hugo does not start any rendering until the .Site.Pages etc. collections are filled and ready.

Some key points here:

  • We have a processing pipeline where we do concurrent processing whenever we can, so your CPUs should be pretty busy.
  • Whenever we do content manipulation (shortcodes, emojis etc.), you will most likely see a hand-crafted parser or replacement function that is built for speed.
  • We really care about the "being fast" part, so we have a solid set of benchmarks to reveal any performance regressions.
  • Hugo is built with Go -- which is really fast, and has a really great set of tools for this (pprof, benchmark support etc.)
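
To make the first point concrete, here is a minimal sketch of "parse everything concurrently, then render". It is not Hugo's code: `Page`, `parsePage`, `loadAll` and `render` are hypothetical names, and the "files" are just strings whose first line is the title. The point it illustrates is that rendering only starts once the full collection exists, while the expensive parsing step is spread over several goroutines.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// Page is a hypothetical, simplified stand-in for a parsed content file.
type Page struct {
	Path  string
	Title string
}

// parsePage pretends to read front matter: the first line of raw is the title.
func parsePage(path, raw string) Page {
	title, _, _ := strings.Cut(raw, "\n")
	return Page{Path: path, Title: strings.TrimSpace(title)}
}

// loadAll parses every source with a fixed number of workers and returns
// only once the whole collection is complete.
func loadAll(sources map[string]string, workers int) []Page {
	paths := make(chan string)
	results := make(chan Page)

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for p := range paths {
				// The map is only read here, so concurrent access is safe.
				results <- parsePage(p, sources[p])
			}
		}()
	}

	// Feed the workers, then close results once all of them have finished.
	go func() {
		for p := range sources {
			paths <- p
		}
		close(paths)
		wg.Wait()
		close(results)
	}()

	var pages []Page
	for pg := range results {
		pages = append(pages, pg)
	}
	// Workers finish in arbitrary order, so sort for a stable collection.
	sort.Slice(pages, func(i, j int) bool { return pages[i].Path < pages[j].Path })
	return pages
}

// render is called only after loadAll has returned, so every page can
// rely on the complete collection being available.
func render(p Page, all []Page) string {
	return fmt.Sprintf("%s (1 of %d pages)", p.Title, len(all))
}

func main() {
	sources := map[string]string{
		"a.md": "First post\nbody",
		"b.md": "Second post\nbody",
		"c.md": "Third post\nbody",
	}
	pages := loadAll(sources, 4)
	for _, p := range pages {
		fmt.Println(render(p, pages))
	}
}
```

The worker count of 4 is arbitrary; in a real program it would more likely be derived from the number of available CPUs.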

Some other points that make the hugo server variant even faster than the regular hugo build:

  • Hugo uses a virtual file system, and we render directly to memory when in server/development mode.
  • We have some partial reloading logic in there. So, even if we render everything every time, we try to reload and rebuild only the content files that have changed, and we don't reload/rebuild templates if it is only a content change, etc.
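
The two server-mode points can be sketched roughly as follows. This is an illustration under simplifying assumptions, not how Hugo is implemented: `devSite` and its methods are hypothetical, a plain map stands in for the virtual file system, and templates are left out entirely. On a content change only the changed file is parsed again, while rendering still runs over every page and writes to memory rather than to disk.

```go
package main

import (
	"fmt"
	"strings"
)

// devSite is a hypothetical model of a development server's state.
type devSite struct {
	parsed   map[string]string // content path -> parsed title (stand-in for a page)
	output   map[string]string // in-memory "public" directory: path -> HTML
	reparsed int               // number of parse calls, to show what was skipped
}

func newDevSite() *devSite {
	return &devSite{parsed: map[string]string{}, output: map[string]string{}}
}

// parse (re-)reads a single content file; the first line is the title.
func (s *devSite) parse(path, raw string) {
	title, _, _ := strings.Cut(raw, "\n")
	s.parsed[path] = strings.TrimSpace(title)
	s.reparsed++
}

// renderAll writes every page into the in-memory output map.
// Nothing touches the disk.
func (s *devSite) renderAll() {
	for path, title := range s.parsed {
		out := strings.TrimSuffix(path, ".md") + ".html"
		s.output[out] = fmt.Sprintf("<h1>%s</h1><p>%d pages</p>", title, len(s.parsed))
	}
}

// onContentChange re-parses only the file that changed, then renders.
func (s *devSite) onContentChange(path, raw string) {
	s.parse(path, raw)
	s.renderAll()
}

func main() {
	s := newDevSite()
	s.parse("a.md", "First\nbody")
	s.parse("b.md", "Second\nbody")
	s.renderAll()

	// Editing a.md triggers one additional parse, not two.
	s.onContentChange("a.md", "First, edited\nbody")
	fmt.Println(s.output["a.html"])
	fmt.Println("parse calls:", s.reparsed)
}
```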

I'm bep on GitHub, the main developer on Hugo.

You can see AllPages in hugolib/page_collections.go.

A git blame shows that it was modified in Sept. 2016 for Hugo v0.18 in commit 698b994, in order to fix PR 2297, "Fix Node vs Page".

That PR references the discussion/improvement proposal "Node improvements":

Most of the "problems" with this get much easier once we agree that a page is just a page that is just a ... page...

And that a Node is just a Page with a discriminator.

So:

  • Today's pages are Page with discriminator "page"
  • Homepage is Page with discriminator "home" or whatever
  • Taxonomies are Pages with discriminator "taxonomy"
  • ...

They have some structural differences (pagination etc.), but they are basically just pages.

With that in mind we can put them all in one collection and add query filters on discriminator:

  • .Site.Pages: filtered by discriminator = 'page'
  • .Site.All: No filter
  • where: when the sequence is Pages add discriminator = 'page', but let user override

That key (the discriminator) makes it possible to quickly retrieve all 'pages' from the single collection.
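
A small sketch of the idea in the proposal, assuming a hypothetical `Kind` string type as the discriminator and a `Site` that indexes pages by kind once, up front. The names are illustrative and not taken from hugolib.

```go
package main

import "fmt"

// Kind is the discriminator; the values follow the quoted proposal.
type Kind string

const (
	KindPage     Kind = "page"
	KindHome     Kind = "home"
	KindTaxonomy Kind = "taxonomy"
)

// Page is a hypothetical page type: every node is a Page plus a Kind.
type Page struct {
	Kind  Kind
	Title string
}

// Site keeps one collection and an index by discriminator.
type Site struct {
	all    []Page
	byKind map[Kind][]Page
}

// NewSite builds the index once, so later lookups need no scan.
func NewSite(pages []Page) *Site {
	s := &Site{all: pages, byKind: make(map[Kind][]Page)}
	for _, p := range pages {
		s.byKind[p.Kind] = append(s.byKind[p.Kind], p)
	}
	return s
}

// AllPages applies no filter.
func (s *Site) AllPages() []Page { return s.all }

// Pages is filtered by discriminator = "page".
func (s *Site) Pages() []Page { return s.byKind[KindPage] }

// Where lets the caller override the default discriminator.
func (s *Site) Where(k Kind) []Page { return s.byKind[k] }

func main() {
	site := NewSite([]Page{
		{Kind: KindHome, Title: "Home"},
		{Kind: KindPage, Title: "First post"},
		{Kind: KindTaxonomy, Title: "Tags"},
		{Kind: KindPage, Title: "About"},
	})
	fmt.Println(len(site.AllPages()), len(site.Pages()), len(site.Where(KindTaxonomy)))
}
```

Building the per-kind index at construction time is one way to get the quick retrieval mentioned above; a plain filter loop over the single collection would work as well for small sites.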