<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="article.xsl"?> 

<articles>
  <article date="23 Oct 2019">
    <pagetitle>XML Parsing Error: Not Well Formed</pagetitle>
    <articleheader>Challenges when building a website using <span class="danger">XML and XSLT</span></articleheader>
    <articleabstract>I don't regret building this site with <span class="danger">XML + XSLT</span>, but there are some obstacles that I didn't anticipate</articleabstract>
    <articlebody>

<p>When I started this site, I did so with the intention of tracking my video game collection. It's grown since then<sup class="inlinefootnote">1</sup> into whatever it is now. I don't do any SEO, I don't do any advertising, and I don't even do much word-of-mouth promotion of this thing, because the goal wasn't to make something that generated a lot of traffic, or any traffic, really. The goal was just to track my game collection, and that was it. And if someone accidentally stumbled on my list of NES games, well, okay, but there's not really all that much that they can do with that.</p>

<p>But then I started having ideas for other things that I could do. I remembered learning about XML in my school days, and I started cooking up ways that I could use that and refresh my memory on how <span class="danger">XSLT</span> worked. Even though browser support is stuck in 1999 for some reason<sup class="inlinefootnote">2</sup>, there were some interesting things that could be done, and doing those things would let me learn a little bit more about <span class="danger">XML and XSLT</span>.</p>

<p>If you search around the Internet for advice about <span class="danger">XML and XSLT</span>, you'll find that a lot of programmers absolutely hate it for some reason. It's really easy to find some neophyte looking to start working on a website, and thinking that <span class="danger">XML and XSLT</span> is a good choice, and then you'll find seventy billion replies telling the original person: no, don't do that; no, you're in for a headache; no, it's a bad idea; no, no, no; and so on. Most of these programmers are coming at this from a programming perspective. They go into it thinking that <span class="danger">XSLT</span> is a full-on programming language (and it kind of is, but, as far as I can tell, it's a bad fit as a general-purpose programming language), and they try to make <span class="danger">XSLT</span> do things that a full-fledged programming language makes easy, and they run into headaches.</p>

<p>But <span class="danger">XML and XSLT</span> isn't like that, at least not in the way I use it here. <span class="danger">XSLT</span> is just a way to transform an XML document into some other kind of document, and I've had a lot of luck using it for the simple use cases here.</p>
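
<p>To give a sense of scale, here is a minimal sketch of that kind of transformation. It is not this site's actual stylesheet: the element names (games, game, title), the sample entries, and the file name games.xsl are all made up for illustration. The first block is the data file, which points at its stylesheet with a processing instruction; the second is an XSLT 1.0 stylesheet that turns it into an HTML list.</p>

<pre><![CDATA[<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="games.xsl"?>
<!-- games.xml (hypothetical): the data, pointing at its stylesheet -->
<games>
  <game><title>Metroid</title></game>
  <game><title>Contra</title></game>
</games>]]></pre>

<pre><![CDATA[<?xml version="1.0" encoding="UTF-8"?>
<!-- games.xsl (hypothetical): turns the list into an HTML page -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" encoding="UTF-8"/>

  <!-- Match the document root and build the page skeleton -->
  <xsl:template match="/">
    <html>
      <head><title>Games</title></head>
      <body>
        <ul>
          <xsl:apply-templates select="games/game"/>
        </ul>
      </body>
    </html>
  </xsl:template>

  <!-- Each game element becomes one list item -->
  <xsl:template match="game">
    <li><xsl:value-of select="title"/></li>
  </xsl:template>
</xsl:stylesheet>]]></pre>

<p>A browser that supports client-side XSLT should apply the stylesheet when it opens the data file, so there is nothing to run on the server.</p>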

<p>Part of the problem is that support for <span class="danger">XSLT in web browsers</span> is stuck in 1999 with <span class="danger">XSLT 1.0</span>, and even that support is not complete. There are a lot of quality-of-life improvements that you just can't use if you're using <span class="danger">client-side XSLT transformations</span>, and that is not going to change any time soon. But that doesn't mean that <span class="danger">XSLT</span> development stopped in 1999. The language stayed in active(ish) development, and version 3.0 became a W3C Recommendation in 2017<sup class="inlinefootnote">3</sup>. If you want to take advantage of the newer versions with all their whiz-bang new features, you have to do server-side processing with something like Saxon <span class="danger">XSLT</span>. I don't want to do that.</p>

<p>I wanted to make something simple. Since this project was originally just for my own benefit, I chose something that I was already kind of familiar with and massaged it into doing the job that I wanted it to do. In this case it was to take an arbitrary list of things, put them in a nice layout, and sort them so I could find them easily. And I've done that. I also extended that idea a little bit to be able to present articles, and I've got some other projects in the works that may or may not see the light of day, but are useful for learning how to do the things that I want to do. </p>
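
<p>The sorting part of that job is roughly what the xsl:sort instruction is for. Below is a rough sketch in XSLT 1.0, not the stylesheet this site really uses. It assumes a hypothetical input document with a games root element whose game children each have a title and a platform; those names are invented for the example.</p>

<pre><![CDATA[<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" encoding="UTF-8"/>

  <!-- Assumed input shape: <games><game><title/><platform/></game>...</games> -->
  <xsl:template match="/games">
    <html>
      <head><title>Collection</title></head>
      <body>
        <table>
          <tr><th>Title</th><th>Platform</th></tr>
          <xsl:for-each select="game">
            <!-- xsl:sort elements must come first inside xsl:for-each.
                 The first one is the primary key, the second breaks ties. -->
            <xsl:sort select="platform"/>
            <xsl:sort select="title"/>
            <tr>
              <td><xsl:value-of select="title"/></td>
              <td><xsl:value-of select="platform"/></td>
            </tr>
          </xsl:for-each>
        </table>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>]]></pre>

<p>The source file can stay in whatever order things were added to it, because the ordering happens at display time.</p>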

<p>And, eventually, I decided that if I'm writing articles, well, maybe someone other than me might want to read one. So, I decided to see if I could get this site indexed by a search engine or two and throw my hat into the ring. This also turned out to be more difficult than I expected.</p>

<p>I started off with <span class="danger">Google</span>. I managed to get the site submitted, but they only crawled the index page. It seems that <span class="danger">Googlebot</span> doesn't understand <span class="danger">XSLT</span>, and the cached page just shows the unmarked-up text of the main index page, and nothing else. I ended up creating a <a href="http://wyrm.org/sitemap.xml">sitemap</a> and an <a href="http://wyrm.org/rss.xml">RSS feed</a>, and <span class="danger">Googlebot</span> still doesn't seem to be able to find them and/or is unwilling to crawl the site again. I added a <a href="http://wyrm.org/robots.txt">robots.txt</a> file with the location of the sitemap, and it still hasn't crawled this site in over three months (as of this writing, the last time this site was 'crawled' by <span class="danger">Googlebot</span> was July 2019), but at least it shows up in the search results if you type in 'wyrm.org'.</p>
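
<p>For reference, the sitemap format itself is small. Here is a sketch of a one-entry sitemap following the sitemaps.org protocol. The URL inside loc is a placeholder on example.com, not a real page on this site, and the lastmod element is optional. The matching robots.txt entry is a single line: the word Sitemap, a colon, and the full URL of the sitemap file.</p>

<pre><![CDATA[<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <!-- Placeholder URL; each page to be crawled gets its own url element -->
    <loc>http://example.com/articles.xml</loc>
    <!-- Optional: date of last modification, in YYYY-MM-DD form -->
    <lastmod>2019-10-23</lastmod>
  </url>
</urlset>]]></pre>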

<p>I tried to get this site listed in DuckDuckGo, but that's been an exercise in frustration. DDG's help page says that they don't take plain old submissions because DDG finds sites automatically, so there's no reason to submit a particular URL. One of the sources they use is Bing, so I used Bing's Webmaster Tools to jump through their hoops, and I still haven't gotten anywhere. It's possible (even probable) that this site doesn't have the kind of content that might show up in searches or would even be of interest to anyone except me. And I know that it's not the regular kind of <span class="danger">Google-SEO-Optimized junk</span> that has overrun the Internet, so maybe that has something to do with it.</p>

<p>So, it's annoying that I can't get search engines to crawl this site correctly, and there's a constant threat that support for the technology that I chose will get removed in the next version of major browsers. But those are minor concerns. They're minor concerns because I didn't make this site to make money. I didn't make this site to show up at the top of the search rankings. I didn't make this site to be useful to anyone else but me. And, as long as this site is useful to me, I'll keep working on it in the way I want to.</p>

<p>But what if you have an idea for a site, and you think that <span class="danger">XML and XSLT</span> would be a good fit? Would I recommend that you use <span class="danger">XML and XSLT</span>? No. Would I recommend that you use something else? Also no. I recommend using whatever you want for it because, in the end, the tool you use doesn't really matter all that much. All that really matters is the result, and you can get a good result with nearly any tool as long as you take the time to learn that tool.</p>


</articlebody>
    <footnotes>
      <footnote>As these things tend to do.</footnote>
      <footnote>Which is completely insane, but that's something we can discuss in another article.</footnote>
      <footnote>I suspect that the only reason that it gets any development at all these days is because Michael Kay uses it as an excuse to develop and sell new versions of Saxon XSLT. It's a genius self-perpetuating cycle.</footnote>
    </footnotes>
  </article>
</articles>
