Challenges when building a website using XML and XSLT

I don't regret building this site with XML + XSLT, but there are some obstacles that I didn't anticipate.

When I started this site, I did so with the intention of tracking my video game collection. It's grown since then[1] into whatever it is now. I don't do any SEO, I don't do any advertising, and I don't even do much word-of-mouth promotion of this thing, because the goal wasn't to make something that generated a lot of traffic, or any traffic, really. The goal was just to track my game collection, and that was it. And if someone accidentally stumbled on my list of NES games, well, okay, but there's not really all that much that they can do with that.

But then I started having ideas for other things that I could do. I remembered learning about XML in my school days, and I started cooking up ways that I could use it and refresh my memory on how XSLT worked. Even though browser support is stuck in 1999 for some reason[2], there were some interesting things that could be done, and doing those things would let me learn a little bit more about XML and XSLT.

If you search around the Internet for advice about XML and XSLT, you'll find that a lot of programmers absolutely hate them for some reason. It's really easy to find some neophyte who is looking to start working on a website and thinks that XML and XSLT are a good choice, followed by seventy billion replies telling the original poster: no, don't do that; no, you're in for a headache; no, it's a bad idea; no, no, no; and so on. Most of these programmers are coming at this from a programming perspective. They go into it thinking that XSLT is a full-on programming language (and it kind of is, but, as far as I can tell, it's a bad fit as a general programming language), they try to make XSLT do things that a full-fledged programming language makes easy, and they run into headaches.

But XML and XSLT aren't like that, at least not in the way I use them here. XSLT is just a way to transform an XML document into some other kind of document, and I've had a lot of luck using it for the simple use cases here.
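For a sense of what that looks like in a client-side setup, here is a minimal sketch, not the actual markup of this site. The element names (games, game, title, platform) and the stylesheet filename games.xsl are made up for illustration. The xml-stylesheet processing instruction near the top is what tells the browser to run the transformation before displaying anything.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical file: games.xml. The href below is an assumed stylesheet name. -->
<?xml-stylesheet type="text/xsl" href="games.xsl"?>
<games>
  <game>
    <title>Metroid</title>
    <platform>NES</platform>
  </game>
  <game>
    <title>Castlevania</title>
    <platform>NES</platform>
  </game>
</games>
```

The data file stays plain and readable, and everything about presentation lives in the stylesheet it points to.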

Part of the problem is that support for XSLT in web browsers is stuck in 1999 with XSLT 1.0, and even that support is not complete. There are a lot of quality-of-life improvements that you just can't use if you're using client-side XSLT transformations, and that is not going to change any time soon. But that doesn't mean that XSLT development stopped in 1999. The language was in active(ish) development and is up to version 3.0 as of 2017 or so[3]. If you want to take advantage of the newer versions with all their whiz-bang new features, you have to do server-side processing with something like Saxon XSLT. I don't want to do that.

I wanted to make something simple. Since this project was originally just for my own benefit, I chose something that I was already kind of familiar with and massaged it into doing the job that I wanted it to do. In this case it was to take an arbitrary list of things, put them in a nice layout, and sort them so I could find them easily. And I've done that. I also extended that idea a little bit to be able to present articles, and I've got some other projects in the works that may or may not see the light of day, but are useful for learning how to do the things that I want to do.
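The "take a list, lay it out, and sort it" job can be sketched in plain XSLT 1.0, which is all that browsers support. This is an illustration under assumptions, not the stylesheet this site uses: it assumes a hypothetical input document with a games root element containing game elements, each with a title and a platform child.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical stylesheet (games.xsl). Assumes input shaped like:
     <games><game><title>...</title><platform>...</platform></game></games> -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" encoding="UTF-8" indent="yes"/>

  <!-- Build the page skeleton around the list. -->
  <xsl:template match="/games">
    <html>
      <head><title>Game collection</title></head>
      <body>
        <table>
          <tr><th>Title</th><th>Platform</th></tr>
          <!-- Sort by platform first, then by title within each platform. -->
          <xsl:apply-templates select="game">
            <xsl:sort select="platform"/>
            <xsl:sort select="title"/>
          </xsl:apply-templates>
        </table>
      </body>
    </html>
  </xsl:template>

  <!-- Each game becomes one table row. -->
  <xsl:template match="game">
    <tr>
      <td><xsl:value-of select="title"/></td>
      <td><xsl:value-of select="platform"/></td>
    </tr>
  </xsl:template>
</xsl:stylesheet>
```

Nothing here goes beyond templates and xsl:sort, which is roughly the level of XSLT that this kind of list-and-layout job needs.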

And, eventually, I decided that if I'm writing articles, well, maybe someone other than me might want to read one. So, I decided to see if I could get this site indexed by a search engine or two and throw my hat into the ring. This also turned out to be more difficult than I expected.

I started off with Google. I managed to get the site submitted, but they only crawled the index page. It seems that Googlebot doesn't understand XSLT, and the cached page just shows the unmarked-up text of the main index page, and nothing else. I ended up creating a sitemap and an RSS feed, and Googlebot still doesn't seem to be able to find them and/or is unwilling to crawl the site again. I added a robots.txt file with the location of the sitemap, and it still hasn't crawled this site in over three months (as of this writing, the last time this site was 'crawled' by Googlebot was July 2019), but at least it shows up in the search results if you type in 'wyrm.org'.
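For reference, a sitemap of the kind mentioned above is itself a small XML file that follows the sitemaps.org protocol. The sketch below is only an illustration: example.org and the page paths are placeholders, not this site's real URLs.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical sitemap.xml; example.org and the paths are placeholders.
     A robots.txt file would point to it with a line such as:
     Sitemap: https://example.org/sitemap.xml -->
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.org/index.xml</loc>
  </url>
  <url>
    <loc>https://example.org/articles/xml-and-xslt.xml</loc>
  </url>
</urlset>
```

Listing each page explicitly gives a crawler a way to discover URLs without having to follow links that only exist after the XSLT transformation runs, though, as described above, that does not guarantee the crawler will actually use it.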

I tried to get this site listed in DuckDuckGo, but that's been an exercise in frustration. DDG's help page says that they don't take plain old submissions because DDG finds sites automatically, so there's no reason to submit a particular URL. One of the sources they use is Bing, so I used Bing's Webmaster Tools to jump through their hoops, and I still haven't gotten anywhere. It's possible (even probable) that this site doesn't have the kind of content that might show up in searches or would even be of interest to anyone except me. And I know that it's not the regular kind of Google-SEO-Optimized junk that has overrun the internet, so maybe that has something to do with it.

So, it's annoying that I can't get search engines to crawl this site correctly, and there's a constant threat that support for the technology that I chose will get removed in the next version of major browsers. But those are minor concerns. They're minor concerns because I didn't make this site to make money. I didn't make this site to show up at the top of the search rankings. I didn't make this site to be useful to anyone else but me. And, as long as this site is useful to me, I'll keep working on it in the way I want to.

But what if you have an idea for a site, and you think that XML and XSLT would be a good fit? Would I recommend that you use XML and XSLT? No. Would I recommend that you use something else? Also no. I recommend using whatever you want for it because, in the end, the tool you use doesn't really matter all that much. All that really matters is the result, and you can get a good result with nearly any tool as long as you take the time to learn that tool.

Footnotes

  1. As these things tend to do.
  2. Which is completely insane, but that's something we can discuss in another article.
  3. I suspect that the only reason that it gets any development at all these days is because Michael Kay uses it as an excuse to develop and sell new versions of Saxon XSLT. It's a genius self-perpetuating cycle.

