By now it should be well known that we’re notorious users of SAP MaxDB in a “non-SAP environment”, and we have done rather well with it over the last seven years. We have gained quite some experience running, administering and working with that RDBMS in our environment, and we manage to get our production work done on top of it without thinking about it too much, which seems a good thing. However, there are several nuisances about that platform, both political ones (SAP product and licensing decisions) and technical ones, including the lack of support for it in most of the neat tools, toys and frameworks.
I’ve already written about one of Toby Segaran’s books in the past. Back then, I was pretty enthusiastic about both his style of writing and his way of passing on knowledge. It’s nice to see this happen again with another book of his, “Programming the Semantic Web”.
At the moment, for various reasons, I am refreshing some of my knowledge of ontologies, RDF, inference, reasoning and most of the technologies and concepts usually referred to as the “semantic web”. I was searching for a quick guide, ideally a “hands-on” one covering one of the technologies we’re already using. It took only a few moments to see that the book is a match on the latter point: same as the “Collective Intelligence” book, the “Semantic Web” one provides an extensive amount of sample code implemented in Python, which is fine for us as we use both Python and Jython in a production environment and I’m quite familiar with both. So it feels like home from this point of view, even though “Semantic Web” is less “Python only” than its predecessor.
Aside from this, the book is rather good in terms of being “hands-on” as well. Following the same style as “Collective Intelligence”, the “Semantic Web” book provides a wealth of step-by-step samples for trying out all the concepts and approaches it introduces. Again, the authors provide plenty of well-prepared sample data, mainly in .csv and .txt, on the book’s web site. And again, I am amazed by the refreshingly pragmatic way of hacking things up to make them work initially, to give the reader something to play with and to figure out how things go, and only then diving deeper into what happens, extending and optimizing here and there. Python surely has a sweet spot here in its interactive mode, which allows trying out all the examples almost immediately without having to master too big a technology stack before getting anything to run (which, unfortunately, seems to be the case once in a while in the Java+IDE+application-server world). Despite enjoying solving “real-world” problems using Java, I quite often admire the straightforward approach to doing things in Python, and this book once again shows me why Python is quite a good learning language, too: for most of the explanations, it is not a long way from a more or less formal description of a problem to a working prototype that can be run on a sample dataset. Technically, that is the definite strength of this book.
As far as the non-technical things go, it is obvious that Segaran and his co-authors have profound experience in explaining complex things in an easy, straightforward fashion. I am still astounded by the path they take from “structured”, schema-driven database design to modeling RDF triples, pointing out both why one might want to follow this path and how to do so step by step. I have seen quite a few explanations of what triple stores are about, and most of them are far more arcane and far worse. If this is something you suffer from, “Programming the Semantic Web” is definitely up to setting that right. And this is the style of the whole book: concise, smart explanations that build upon each other, with each subchapter placed and structured to be just the “logical next step” if you have consciously read the chapters before. The whole book reads as one continuous flow of information without any obvious breaks or loose ends. I guess this is quite a subjective point of view, especially in terms of how one wants to dive into things (again), but at least to me this seems close to optimum.
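To make the step from a schema-driven row to triples a bit more tangible, here is a minimal sketch in Python. It is not the book's code: the class `TinyTripleStore`, the helper `row_to_triples` and the sample restaurant row are all made up for illustration, and a real store would keep more than one index. The idea is simply that each column of a row becomes one (subject, predicate, object) statement about the row's key.

```python
from collections import defaultdict


class TinyTripleStore:
    """Hypothetical in-memory triple store; an illustration, not the book's implementation."""

    def __init__(self):
        # Single index: subject -> predicate -> set of objects.
        self._spo = defaultdict(lambda: defaultdict(set))

    def add(self, subject, predicate, obj):
        self._spo[subject][predicate].add(obj)

    def triples(self, subject=None, predicate=None, obj=None):
        """Yield every stored triple matching the pattern; None acts as a wildcard."""
        for s, preds in self._spo.items():
            if subject is not None and s != subject:
                continue
            for p, objs in preds.items():
                if predicate is not None and p != predicate:
                    continue
                for o in objs:
                    if obj is None or o == obj:
                        yield (s, p, o)


def row_to_triples(table, key, row):
    """Turn one relational row (a dict) into (subject, predicate, object) triples."""
    subject = f"{table}/{row[key]}"
    # Every non-key, non-NULL column becomes one statement about the subject.
    return [(subject, column, value)
            for column, value in row.items()
            if column != key and value is not None]


# Made-up sample row, as it might come out of a "restaurant" table.
row = {"id": 7, "name": "Deli Llama", "city": "Berkeley",
       "cuisine": "deli", "phone": None}

store = TinyTripleStore()
for triple in row_to_triples("restaurant", "id", row):
    store.add(*triple)

# Pattern query: which subjects are located in Berkeley?
in_berkeley = sorted(store.triples(predicate="city", obj="Berkeley"))
```

Note that the empty `phone` column simply produces no triple, and adding a new kind of fact later means adding a statement rather than altering a table, which is roughly the flexibility argument for triples over a fixed schema.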
Plus, there is another good thing that strongly reminds me of the “Collective Intelligence” book: alongside its main subject, the book gives you plenty of chances to sharpen your (Python) coding skills, and you get quick-and-dirty explanations of a bunch of technologies along the way. Whether geocoding, graphviz, rdflib, networkx, dbpedia or freebase, reading this book provides a fair understanding of these things as a mere “by-product”. It is a rather valuable by-product, too, as most of these tools and services are simply around, mostly free and ready to be used to solve problems. So the book doesn’t just provide you with a clear understanding of what happens (theoretically) but also, again, with a set of tools to really solve these problems in day-to-day life. Ultimately, you even find a framework like CherryPy described here as a way to quickly serve dynamic content.
So, overall, this book doesn’t differ much from the “Collective Intelligence” one and is just as highly recommended. Whether you are refreshing your knowledge or, I dare say, learning these tricks from scratch, Segaran et al. will provide you with a profound and extensive introduction to the subject. They leave you with lots of things to try yourself and lots of ways to apply what you have learnt to real-world problems and, last but not least, with plenty of new ideas for things that could, and possibly should, be done in your business applications. Very inspiring.
For quite a while now we’ve been using Jython, a (re-)implementation of the Python programming language that runs on top of the Java VM and along the lines of JSR-223 (Scripting for the Java Platform), in our environment to deal with some use cases which are addressed far more easily using a scripting language than, for example, using Java code inside a webapp. However, so far our integration has left a lot to be desired, so it was about time to get these things improved somehow.
Simply put: “Programming Collective Intelligence” is one of the most outstanding publications related to IT and software development I’ve read in a while. Given some of our business use cases, I am currently somewhat deeper into analyzing data linked to users in our environment and, subsequently, making decisions and suggestions based on it (for the obvious reasons of making both our work a little easier and our users’ overall experience a little better), and browsing the table of contents of this book made it seem worth a closer look. And indeed, after a closer look, I found that this book offers profound information on the issue I am dealing with, and much more beyond this scope…