4 Tips for Optimizing Incremental Sitecore Content Publishing

For the past few years, I have not been a big fan of incremental publishing. In fact, I’ve often said that if there was one publishing mode to never use, it was incremental publishing. I didn’t know all the facts, but the truth was that every time I tried to implement a solution with incremental publishing, the content wouldn’t publish correctly.

Over time, Sitecore has been tweaking incremental publishing and making it better, and some of the issues (like handling item renames) have been fixed in 7.2. However, if you’re still on an earlier release, there are a few tweaks you can make to improve your incremental publishing.

#1. Use a scheduled publish agent

By default, the scheduled PublishAgent is disabled and will not run, so users must publish manually. However, if you plan on implementing a solution with incremental publishing, I recommend moving to a scheduled publishing model. When a user performs an incremental publish, their security access is taken into account, and any item they do not have access to will not be published. Future incremental publishes will not attempt to republish that item, because they run based on the last publish date and do not take the skipped item into account. This leads to missing items in the published database.

Instead, enable the scheduled PublishAgent by specifying an interval for publishing. With the scheduled task handling the publishing, the security context is taken out of the picture. You can enable the agent using a config file in the App_Config\Include folder that changes the interval. The example below sets the interval so the agent runs every 5 minutes:

<?xml version="1.0" encoding="utf-8" ?>
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/" xmlns:set="http://www.sitecore.net/xmlconfig/set/">
	<sitecore>
		<scheduling>
			<agent type="Sitecore.Tasks.PublishAgent" set:interval="00:05:00">
				<param desc="languages">en</param>
			</agent>
		</scheduling>
	</sitecore>
</configuration>

#2. Only publish if there are items queued for processing

The default PublishAgent will execute a publish action for all configured languages at each interval. However, if you are running incremental publishing, there may be no new items to publish, and the default agent will still execute a publish job even though the queue is empty. This leads to unnecessary processing and publish event triggers.

Instead, override the default PublishAgent with your own implementation that checks the current queue if the publishing mode is incremental. You can get the current queue using the following Sitecore API call:

var queue = PublishQueue.GetPublishQueue(publishOptions);

Just check the queue length to make sure the publishing only happens if there is something that could be published!

#3. Only clear the cache when items are published

If you are running a single-instance site where authoring and delivery are on the same Sitecore installation, you can further optimize publishing by creating a custom HtmlCacheClearer. The queue that is used by the incremental publish is a list of publishing candidates. Sometimes, these candidates are not yet in a state where they can be published. For example, a new item version in a draft state will appear in the queue but will not publish because it is not in the final state of the workflow. Even with an optimized PublishAgent implementation, this will trigger an incremental publish and the publish events will fire. This will lead to a cache clear by the HtmlCacheClearer that is registered to the publish:end and publish:end:remote events.

In this scenario, the HTML Cache will be cleared unnecessarily because publish events are occurring even though nothing is actually being published.

Instead, you can replace the default HtmlCacheClearer with your own implementation that will only fire the cache clearing if something was actually published. The publish:end event handlers receive a SitecoreEventArgs parameter which contains the Publisher which fired the event. From the Publisher, you can extract the Job and the job status. From here, you will find the publishing statistics in the messages from the Publisher.

	//Get the Publisher
	Publisher publisher = (Publisher)args.Parameters.FirstOrDefault(x => x is Publisher);

	//Get the job
	Job job = JobManager.GetJob(publisher.GetJobName());

	//Extract the messages and total processed
	var messages = new List<string>();
	messages.AddRange(job.Status.Messages.Cast<string>());
	long processed = job.Status.Processed;

	//In the messages, determine the number skipped.
	var skippedMessage = messages.FirstOrDefault(x => x != null && x.Contains("skipped"));
	int skippedCount = GetCountFromMessage(skippedMessage);

	//Return false (skip the cache clear) only if every processed item was skipped
	return skippedCount != processed;

By using these messages, you can determine if the number of items skipped matches the total count of items processed. If so, that means nothing was changed, and you can safely skip the cache clear!
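
The snippet above calls GetCountFromMessage without showing it. Here is a minimal sketch of what such a helper might look like, assuming the skipped message ends with its count (something like "Items skipped: 5"). The PublishMessageParser class name and the SomethingWasPublished method are hypothetical and not part of Sitecore, so check the exact message text in your own version before relying on it.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper class, not part of the Sitecore API.
public static class PublishMessageParser
{
    // Assumption: the count is the last run of digits in the message,
    // e.g. "Items skipped: 5". Returns 0 for null or unparseable input.
    public static int GetCountFromMessage(string message)
    {
        if (string.IsNullOrEmpty(message))
        {
            return 0;
        }

        Match match = Regex.Match(message, @"(\d+)\D*$");
        int count;
        return match.Success && int.TryParse(match.Groups[1].Value, out count) ? count : 0;
    }

    // True when at least one processed item was not skipped,
    // i.e. the HTML cache clear is worth running.
    public static bool SomethingWasPublished(IEnumerable<string> messages, long processed)
    {
        string skippedMessage = messages.FirstOrDefault(m => m != null && m.Contains("skipped"));
        return GetCountFromMessage(skippedMessage) != processed;
    }
}
```

In a custom cache clearer you would pass in the job's messages and processed count, and only call the base cache-clearing logic when SomethingWasPublished returns true.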

#4. Test the Candidates in the Publishing Queue

While the HtmlCacheClearer optimization works for single-instance installations, the PublishRemoteEventArgs do not have the statistics information needed to optimize the cache clearing. This means that in a distributed model where Content Delivery instances are separated from the publishing instance, the HTML cache will always get cleared every time the publish agent triggers a publishing job.

This is bad. 😉

The only way to prevent this is to minimize the number of unnecessary publishing jobs that get triggered. The Publish Item pipeline has a lot of code in it that determines whether to publish the version or skip it, but the code is not available from outside the pipeline classes.

You’ll likely have to use a reflector (like dotPeek) and grab the code yourself. The Sitecore methods that are the most helpful are:

  • PublishHelper.IsSameRevision
  • PublishHelper.CompareItemsWithoutRevision
  • PublishHelper.CompareClonedFields
  • PublishHelper.CompareNotVersionedFields

Once you’ve created your testing class that can determine if an Item will need to be published, you can enhance the custom PublishAgent you created to test each candidate in the publishing queue.

Previously, the agent was only submitting a publish job if the queue was not empty. Now, it will only submit the publish job if the queue is not empty and at least one of the candidates in the queue would trigger a publish. Something like the following will do the trick:

	//Check all items in queue if they can publish
	var queue = PublishQueue.GetPublishQueue(publishOptions);
	bool hasItemsToPublish = (queue != null) && queue.Any(PublishTester.IsPublishable);

In the above example, PublishTester is a class that contains whatever utility methods you need to determine if a candidate item can publish.
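
To make the decision easier to debug, the check can also report why it skipped a publish. The following is only a sketch of that idea, written against a generic candidate type so it stands on its own. PublishGate, ShouldPublish and the reason strings are hypothetical names, and the isPublishable delegate stands in for whatever your PublishTester does.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the Sitecore API.
public static class PublishGate
{
    // Returns true only when the queue holds at least one candidate that
    // the supplied test says would actually publish. The reason is meant
    // to be written to your log when the agent skips a publish.
    public static bool ShouldPublish<T>(IEnumerable<T> queue, Func<T, bool> isPublishable, out string reason)
    {
        if (queue == null)
        {
            reason = "publish queue unavailable";
            return false;
        }

        // Materialize once so the queue is not enumerated twice.
        List<T> candidates = queue.ToList();
        if (candidates.Count == 0)
        {
            reason = "publish queue is empty";
            return false;
        }

        if (!candidates.Any(isPublishable))
        {
            reason = "no candidate in the queue is publishable";
            return false;
        }

        reason = "at least one candidate is publishable";
        return true;
    }
}
```

The custom agent would call this with the queue and PublishTester.IsPublishable, log the reason, and only start the publish job when the result is true.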

Some things to watch out for

Once you switch to a scheduled publishing model, there are a few things you’ll need to be careful of:

  1. Shared layouts field: By default, Sitecore ships with the presentation details stored in a shared Layouts field. This means that if you edit a new version of a page and start changing the layout (like adding a component to the page) it applies to all versions, including previously approved ones. The scheduled publisher will pick these changes up and push them out, even though the new version hasn’t been approved yet. You may need to look at unsharing the Layouts field.
  2. Logging: Since the publishing agent will sometimes not be triggering a job, you might want to make sure you have some extra logging if the agent decides to skip a publish so that you can debug why a publish action didn’t occur.

With the improvements in Sitecore 7.2, I’m hopeful we may never need to run an incremental publish again, especially with the speed efficiencies of parallel publishing.

Have you run into any issues with incremental publishing? Actions the incremental publish just doesn’t pick up correctly? How did you solve it?

One thought on “4 Tips for Optimizing Incremental Sitecore Content Publishing”

  1. One thing to note is that serialization of items will not add them to the publish queue (from what I recall).
    So, say you’re adding content items in a deployment (through a package, TDS, Unicorn or whatever), those items won’t be added to the publish queue, and so won’t be published by the scheduling agent.

    This was just one of the things I discovered when looking into the Publish Queue and Incremental publishing in the past.
    http://www.seanholmesby.com/sitecore-publish-queue-and-incremental-publish/

    Great post by the way. I’m yet to look into some of Sitecore 7.2’s publishing enhancements.
