RecursiveMethod.UmbracoXmlParser 1.1.4

There is a newer version of this package available.
See the version list below for details.
.NET CLI:
dotnet add package RecursiveMethod.UmbracoXmlParser --version 1.1.4

Package Manager (intended for the Package Manager Console in Visual Studio, as it uses the NuGet module's version of Install-Package):
NuGet\Install-Package RecursiveMethod.UmbracoXmlParser -Version 1.1.4

PackageReference (for projects that support PackageReference, copy this XML node into the project file to reference the package):
<PackageReference Include="RecursiveMethod.UmbracoXmlParser" Version="1.1.4" />

Paket CLI:
paket add RecursiveMethod.UmbracoXmlParser --version 1.1.4

Script & Interactive (the #r directive can be used in F# Interactive and Polyglot Notebooks; copy it into the interactive tool or the source code of the script to reference the package):
#r "nuget: RecursiveMethod.UmbracoXmlParser, 1.1.4"

Cake:
// Install RecursiveMethod.UmbracoXmlParser as a Cake Addin
#addin nuget:?package=RecursiveMethod.UmbracoXmlParser&version=1.1.4

// Install RecursiveMethod.UmbracoXmlParser as a Cake Tool
#tool nuget:?package=RecursiveMethod.UmbracoXmlParser&version=1.1.4

Fixed in this version

v1.1.4 - Fix for UmbracoParsingOptions.DoctypeMapping, which did not map doctypes correctly in 1.1.3 and earlier.

Umbraco XML Parser

This repository contains code for a NuGet package that allows you to easily parse the Umbraco v4/v6/v7 XML cache file umbraco.config.

As of version 1.1.0-beta, support is also available for the NuCache DB format used by Umbraco v8.0.1 and later (e.g. App_Data\TEMP\NuCache\NuCache.Content.db). Due to a binary format change, Umbraco v8.0.0 is not supported.

These caches contain all published content and property data and can be used programmatically for any of the following purposes:

  • Content migration (eg. migrate all published content elsewhere)
  • Content analysis (eg. how many articles do not have a populated meta description property?)
  • Content reporting (eg. how many published articles do we have this month?)

As of version 1.0.2, Umbraco XML Parser understands the commonly used umbracoUrlAlias and umbracoUrlName elements and uses them to set the URL of the node in question. Note that if umbracoUrlAlias contains multiple comma-separated aliases, only the first is used as the URL.
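
The first-alias rule can be illustrated with a small standalone sketch. This is not the package's own code: FirstUrlAlias is a hypothetical helper written for this example, and trimming whitespace and skipping empty entries are assumptions rather than documented behaviour.

```csharp
using System;
using System.Linq;

public static class UrlAliasSketch
{
    // Hypothetical helper (not part of UmbracoXmlParser): returns the first
    // alias from a comma-separated umbracoUrlAlias value, or null when the
    // value contains no alias at all.
    public static string FirstUrlAlias(string umbracoUrlAlias)
    {
        if (string.IsNullOrWhiteSpace(umbracoUrlAlias))
            return null;

        return umbracoUrlAlias
            .Split(',')
            // Assumption: whitespace around each alias is not significant.
            .Select(alias => alias.Trim())
            // Assumption: empty entries (e.g. from ",foo") are skipped.
            .FirstOrDefault(alias => alias.Length > 0);
    }
}
```

For a value such as "news/latest, old-news", this sketch picks "news/latest" and ignores "old-news".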

Getting the package

Install the package from NuGet: Install-Package RecursiveMethod.UmbracoXmlParser

Using with Umbraco v4/v6/v7

The following are sample LINQPad scripts.

Count all nodes of a certain doctype ("Article"):

var parser = new UmbracoXmlParser("umbraco.config");
var articleCount = parser.GetNodes().Where(n => n.Doctype == "Article").Count();
articleCount.Dump();

Dump all articles or reviews that do not have a meta description populated:

var parser = new UmbracoXmlParser("umbraco.config");
var articles = parser.GetNodes().Where(n => (n.Doctype == "Article" || n.Doctype == "Review") &&
                                             string.IsNullOrWhiteSpace(n.GetPropertyAsString("metaDescription")));
articles.Dump();

Dump all nodes in the site with their node ID and URL, using a specific domain for the root level node (1069 in this case):

var parser = new UmbracoXmlParser("umbraco.config", new UmbracoParsingOptions
{
    UrlPrefixMapping = new Dictionary<int, string> { { 1069, "https://www.examplesite.com.au" } }
});
var articles = parser.GetNodes().Select(n => new { NodeId = n.Id, Url = n.Url });
articles.Dump();

Using with Umbraco v8.0.1 and later

Support for Umbraco v8's NuCache format is in beta. UmbracoXmlParser only reads the first value of each property and does not yet support getting property values for Segments or Cultures. The NuCache binary format itself includes neither Document Type aliases nor usernames for creators or writers, but you can supply them using UmbracoParsingOptions:

Count all nodes of a certain doctype ("Article"):

var parser = new UmbracoXmlParser("NuCache.content.db", new UmbracoParsingOptions
{
    DoctypeMapping = new Dictionary<int, string>
    {
        { 1095, "Article" },
        { 1096, "Review" }
    }
});
var articleCount = parser.GetNodes().Where(n => n.Doctype == "Article").Count();
articleCount.Dump();

Dump all articles or reviews that do not have a meta description populated:

var parser = new UmbracoXmlParser("NuCache.content.db", new UmbracoParsingOptions
{
    DoctypeMapping = new Dictionary<int, string>
    {
        { 1095, "Article" },
        { 1096, "Review" }
    },
    UserMapping = new Dictionary<int, string>
    {
        { -1, "admin" },
        { 1278, "adam" }
    }
});
var articles = parser.GetNodes().Where(n => (n.Doctype == "Article" || n.Doctype == "Review") &&
                                             string.IsNullOrWhiteSpace(n.GetPropertyAsString("metaDescription")));
articles.Dump();

Dump all nodes in the site with their node UID, ID and URL, using a specific domain for the root level node (1095 in this case):

var parser = new UmbracoXmlParser("NuCache.content.db", new UmbracoParsingOptions
{
    UrlPrefixMapping = new Dictionary<int, string> { { 1095, "https://www.examplesite.com.au" } }
});
var articles = parser.GetNodes().Select(n => new { Uid = n.Uid, NodeId = n.Id, Url = n.Url });
articles.Dump();
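
For content reporting across either cache format, a per-doctype count can be a useful starting point. The following is a minimal sketch rather than part of the package: CountByDoctype is a hypothetical helper, and it assumes that n.Doctype is a string, as the comparisons in the samples above suggest. Grouping null values under a placeholder is also an assumption, made so that the dictionary never receives a null key.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DoctypeReportSketch
{
    // Hypothetical helper (not part of UmbracoXmlParser): counts how many
    // times each doctype alias occurs in the given sequence.
    public static Dictionary<string, int> CountByDoctype(IEnumerable<string> doctypes)
    {
        return doctypes
            // Assumption: a missing doctype shows up as null; group it under
            // a placeholder because ToDictionary rejects null keys.
            .GroupBy(d => d ?? "(unmapped)")
            .ToDictionary(g => g.Key, g => g.Count());
    }
}

// Possible usage in a LINQPad script, with a parser created as in the samples above:
//   DoctypeReportSketch.CountByDoctype(parser.GetNodes().Select(n => n.Doctype)).Dump();
```

Keeping the helper independent of the parser means it can be checked against a plain list of strings before it is pointed at a real cache file.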

Full docs

See https://github.com/spudstuff/umbraco-xml-parser

Compatible and additional computed target framework versions:
.NET Framework: net45 is compatible. net451, net452, net46, net461, net462, net463, net47, net471, net472, net48 and net481 were computed.

NuGet packages

This package is not used by any NuGet packages.

GitHub repositories

This package is not used by any popular GitHub repositories.

Version      Downloads   Last updated
1.1.5        417         8/22/2021
1.1.4        438         5/1/2021
1.1.3        431         11/17/2020
1.1.0-beta   415         6/26/2019
1.0.4        1,286       6/5/2017
1.0.3        948         5/31/2017
1.0.2        964         5/28/2017
1.0.1        929         5/28/2017
1.0.0        965         5/25/2017