Last week, the Java part of our system went into production after a major runtime update – and it did so not on top of the Glassfish application server we’ve been using so far but rather restructured into multiple modules embedding a current version of Eclipse Jetty. This is a fairly large change and quite a step, still sort of a work in progress and, after all, once again something worth writing a bit more about…
Stage 1: Loads of Tomcats
In 2004, we started out using external tools outside our legacy document management system in order to do things the DMS and its embedded development tooling weren’t capable of doing in an acceptable way. I have been ranting about these things pretty often around here so I won’t bother going into details about that – in the end, it suffices to say that in 2004 the technical decision was to no longer stop where the tooling ends but rather to try choosing the tool that allows for providing a good technical solution to business problems, too. In several ways, this approach of trying to overcome the boundaries of given platforms, choosing good solutions (mostly from the open-source environment) and integrating or building upon them has proven to be a good thing more than once, and it has become an important part of our day-to-day operations.
Our first external tool implementations were done using Perl, until we eventually switched to Java. Initially this happened after being required to build a client to a remote SOAP web service using DIME attachments for large file transfer: We simply hit a wall with the Perl tools at hand and figured out that solving the same problem in Java on top of Apache Axis took less than a week – including the learning curve for working with the development and build environment, which is anything but trivial using Java, the Java EE web stack and an IDE such as Eclipse.
Subsequently we chose to do more using Java and, at some point, got a running environment consisting of several Java EE web applications running in Apache Tomcat servlet containers, embedding Spring for wiring and infrastructure (which I really used to love back in those days, in its early 1.2 version). The architecture tried to resemble the server structure exposed by the DMS simply because, back then, we thought it would be a good idea not to introduce a whole new and different structure for the Java system outside the DMS. So, we were left with…
- … an application providing a web interface for external customers, running inside a Tomcat container in a demilitarized zone,
- … several backend modules providing various services on top of the DMS SQL database and document file system, running on two Tomcat installations in the core production LAN,
- … communication between these structures built on top of Spring Remoting, Hessian and HTTP, supposed to do that sort of remoting transparently without having to hassle much with serialization, deserialization and transports.
This structure was in use for roughly two years and, generally, worked well. From a runtime point of view, we never really hit a wall (leaving aside the continuous learning curve in administering Tomcat and in optimizing the build process – for example by replacing an all-internal Eclipse build / dependency management with Maven, and things like that). However, during these two years, we gained experience with the platform and also learnt to see some of the shortcomings of our architecture.
First and foremost, the idea to get all the things into separate application modules was caused by the desire to have different logical and loosely coupled parts of the system that might be developed, built, deployed and operated without being forced to bring down and restart the whole application all the time. This definitely was and is a meaningful idea. What we got, however, was a herd of Tomcat servers that needed care, feeding, administration and updates. At this time, affordable and performant server virtualization wasn’t something on our list, so we operated Tomcats on a bunch of physical machines, and now and then experienced the structure locking up or going down simply because one of these machines failed in one way or another (which happened, now and then).
The other thing which, looking back, is blatantly obvious: The applications were anything but loosely coupled – way worse: We ended up with a bunch of Java / Spring web application modules communicating with each other via HTTP and Hessian, and the very moment one of these modules was redeployed or even restarted, everything else would fail in a particularly obscure way, at least for as long as that one module was down. Just as obvious yet way worse: Doing Hessian and Spring Remoting, you end up with a bunch of API classes that are part of each of these applications, and the very moment one of these API classes changes, you end up redeploying all of your modules anyway in order to avoid the usual exceptions you run into in case of class version conflicts.
Stage 2: The Big Server
So, at some point, given that operation got more and more fragile and development / deployment in this infrastructure slowed down considerably, we tried to find a solution to compensate for that. The first and possibly most obvious idea: Let’s ditch the whole herd of servers and dump all the applications into one big container. Right then, in 2007, Glassfish v2 had just been released. At this time we had already done some peeking into what Java EE technology has to offer, found many of these things to be interesting possible solutions to our problems and, consequently, decided to move all the applications together into one Glassfish domain. To cut things short, this happened in less than two weeks. Our applications didn’t require much change for that initially, and most of the time actually was spent on runtime aspects such as getting Glassfish configured correctly for our environment, including the Apache reverse proxy / front controller, the database and filesystem backend and so on.
In this initial version of The Big Server, we ended up with a large Glassfish domain running on decently sized hardware, containing a bunch of web applications still, however, communicating with each other through HTTP remoting. We solved the problem of applications failing at runtime by establishing a semi-formal (shell-script driven) deployment process to build and deploy all the web applications at once. We solved the problem of overall system fragility by moving Glassfish to a strong, well-supported, redundant server machine ready to bear that load. The Glassfish community, even though I still consider it somewhat small, was very friendly and kind in helping us through some basic problems, and overall, we got a way better and way more stable runtime pretty soon – actually, a runtime that remained in operation until last week. 😉
This way, however, development and build remained annoying in some ways. Building a dozen web applications and deploying them to an application server configured inside a running Eclipse IDE ain’t always fun, especially taking the Eclipse Maven tooling into consideration, which was somewhat flaky (to say the very least) in 2007 and in some ways still is today. As a matter of fact, despite various other nuisances I have been using the NetBeans IDE quite a lot in the meantime, simply because using it for these purposes seemed way more usable and straightforward. Anyway, all along the way the process was „optimized“ build/development-wise, too: With the need for a distributed deployment of all these modules essentially eliminated, there seemed to be no particular reason anymore to have many different web applications – what’s the point in doing „local HTTP remoting“ between services on the same machine, anyway? 😉 As an outcome, at some point all of the services, implementations and web application modules were merged into one large web application running in the Glassfish „/“ context, which was little more than copying all these things into one project, merging the different deployment descriptors and Spring configuration files, discarding redundant things (such as the Spring Remoting layer) and making this thing build and deploy well again.
Again, this didn’t take too much time and mostly happened alongside everything else happening day to day. The initial idea, at this point, was to make this large application inside Glassfish (from an architecture point of view obviously a set-back) the starting point for technical refactoring towards an actual Java EE application, making use of more than just the Java EE web tier. Looking back, however: We never made it – for a bunch of different reasons. Learning that Spring is one ____ of a beast if you want to have it refactored out of your stack is one of these. Learning that, with our DMS database structure (which, query-wise, only works out „somehow well“ if tweaking SQL and being pretty careful about what you SELECT), JPA and O/R mapping is neither fun nor trivial anymore was another one. Learning that the implementations of the Java EE frameworks at hand of course come with certain issues and runtime problems of their own (like, later on, JAX-RS / Jersey not being capable of „finding“ its EJB support after redeploying an application and, thus, ceasing to work correctly unless the whole container is restarted) was yet another one. In the end, we ended up running that structure on top of Glassfish for roughly seven years, and most of the time it worked well without too much ado.
So why bother actually reconsidering that structure again? Well, all along these seven years, we of course also dealt with various problems arising from the Java EE stack, our Glassfish runtime and various tooling aspects related to it. First off, using Eclipse, the Maven tooling (m2e), m2e-wtp and Glassfish is a source of recurring frustration. So far I spent (wasted?) countless nights trying to figure out how to set these things right to make sure Eclipse does build and deploy the right things at the right time to the application server running inside the IDE. At the moment, I still am unsure whether there is an out-of-the-box way to achieve this that doesn’t continuously get in your way and just works.
Then, the second thing is that maintaining the runtime has become a bit more challenging with more developers involved. With multiple instances of Glassfish – local development system, testbed system, production hosts – you need to think about managing „local“ configurations for the running application, and you need to consider deployment of applications as well as updates of the platform itself for „core“ Java, Glassfish and some of the frameworks used in there. There are good ways to get these things done, and I am sure these can be optimized by digging deeper into Glassfish as a platform, but they, too, come with a rather steep learning curve for developers. There always is a large load of infrastructure someone new to the system has to understand and, at least to some point, master before being able to work well with things – with, by now, three (or rather two-and-a-half) people doing Java, we’re too small to consider the usual developer / maintainer roles Java EE is thinking in.
And, last but definitely not least: After giving up our attempt to refactor the application towards Java EE modules, we ended up dealing with a somewhat large, memory-consuming runtime (providing a load of features we don’t use) effectively used in order to host one large, chunky Java EE web application embedding Spring. The more time passed, the more this seemed to be just blatant overhead. A bunch of other minor issues came along the road: Oracle apparently axing commercial support for Glassfish, for example. This ain’t too big a deal for us as we haven’t been using commercial support anyway, but given that the Glassfish community always seemed smaller than the communities found, say, around JBoss or some Apache projects, this might become an issue in the future.
Stage 3a: Little REST, thinking of …
Overall, ways seemed required to get rid of this mess. Ideally, a „future platform“ should…
- … allow for easy maintenance (including build, testing, redeployment, launching of „testing instances“) of separated modules by different developers. This was one of the original intentions of our structure – and now this is sort of what we need.
- … depend upon just as much infrastructure as strictly required. Ideally, each module should include everything – configuration, libraries, start/stop scripts, … – required to just unzip and run the application on any machine that has a JDK or JVM around.
- … run fast and straightforward from within an IDE, too, ideally minimizing local build-deploy-run cycles. Though a while ago we spent money on JRebel to ease this process, it still is a clumsy and somewhat difficult process.
Compared to 2004, we now have smarter ideas about how to cut our application into several isolated modules that can run side-by-side without depending upon each other all too much. Talking about building those applications in Java, I had a really inspiring moment a while ago while working on a RESTful endpoint in code for a research project. Most of our development for this happened inside a web application using Jersey in a Glassfish v3. Mostly out of curiosity, towards the end of this implementation, I spent a bit of spare time on rewriting that very service using the Spark framework, which offers lightweight RESTful development on top of an embedded Jetty container. Cutting things short here: Without knowing the framework, rewriting that REST endpoint code took me less than two evenings and, adding a bit of Maven assembly and a few pieces of configuration „magic“, ended up in just what I needed – not a web application war container anymore, but a self-contained restserver-0.0.1-SNAPSHOT-bin.zip I could hand around, unzip anywhere, run anywhere… and just as easily run two or three or a bunch of different instances of that very application on the same host using different ports and different configurations… All these things are possible, too, with Glassfish domains or application servers but require quite a bit more effort to be done right. And, looking at the overall architecture of our system, it let me look at things from a different angle: What about reducing complexity simply by cutting applications into small, digestible, replaceable pieces exposing standardized interfaces?
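To give a rough idea of what such a self-contained service boils down to, here is a minimal sketch. It is not the Spark code from that research project: to stay free of any dependency, it swaps in the small HTTP server that ships with the JDK (com.sun.net.httpserver) in place of Spark and Jetty. The class name RestServer, the /status route and the default port 8080 are made up for illustration.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

public class RestServer {

    /** Starts a tiny HTTP service on the given port (0 = let the OS pick a free port). */
    public static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);

        // Hypothetical route: GET /status answers with a fixed JSON document.
        server.createContext("/status", exchange -> {
            if (!"GET".equals(exchange.getRequestMethod())) {
                exchange.sendResponseHeaders(405, -1); // method not allowed, no body
                exchange.close();
                return;
            }
            byte[] body = "{\"status\":\"ok\"}".getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().set("Content-Type", "application/json");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });

        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        // The port comes from the command line, so several instances
        // can run side by side on the same host.
        int port = args.length > 0 ? Integer.parseInt(args[0]) : 8080;
        HttpServer server = start(port);
        System.out.println("Listening on port " + server.getAddress().getPort());
    }
}
```

Packaged into a jar (the name restserver.jar is hypothetical), two instances could then be launched next to each other with something like `java -cp restserver.jar RestServer 8081` and `java -cp restserver.jar RestServer 8082`, with no container to install or configure first.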
Obviously, the first thing to do with such ideas is your homework, evaluating whether or not you’re alone with this idea. Doing some searches, I stumbled across the idea of micro service architecture (also see these slides and those slides, too, to get an idea) – and liked what I saw as it comes pretty close to my desired structure. Most of these writings outlined very well that this approach comes at a price and has substantial disadvantages, too (of course), but generally, knowing there are like-minded people with similar ideas just around the corner does help in thinking more about such approaches…
Stage 3b: Little REST, working on, looking ahead…
The remainder of this is a rather short story: Yeah right, so let’s get started. There are, however, some substantial questions… First off, upon which foundation should such an environment eventually be built? Surely, at least talking Java, there is Spark, and there are two or three other similar frameworks, but each of these seems to be updated just infrequently and/or to be just moderately active at all. It didn’t take us long to consider Jetty, which most of these frameworks build upon, too. Basically, being an Eclipse project and pretty mature, it seems that, comparing all these alternatives, Jetty is the one most likely to still be around in a couple of years. Then, making an application completely „micro-service“-ish is work – and likely to be work enough to end up just the same way our Java EE refactoring approaches tended to end up. So, to get it started, some sort of migration path is required. We’ve been pretty pragmatic about that… getting our current large web application to run on top of Jetty instead of Glassfish, again, was just a matter of days (including introducing a bit of boilerplate logic for database connectivity and the few things previously handled by the container). So what’s the state now?
- The customer-facing part of our site now runs on top of that runtime in a Jetty launcher containing basically a slightly modified version of the „old“ web application. This is not much of a problem and not likely to change as, side by side, we’re deploying a new web frontend, also built as a custom web application yet being a different (new) code base, so we do not expect to do much maintenance of the „old“ application.
- The part of the system used by our internal users so far is a branch of the old web application, drastically stripped, having all unnecessary code (web frontend stuff, for example) and dependencies removed, running on a different host in a different Jetty launcher. This is the codebase that is already being „reworked“ in terms of refactoring, code cleanup or re-implementation of parts of the functionality – the latter happening in custom services.
- With the current code base, we already have a pretty clear idea which services we will ultimately have to end up with and what their interfaces will have to look like, more or less. Our new external frontend code is already tailored to work with this kind of backend service.
So, things are well and all that remains is just work? Looking at that history: I doubt it. 🙂 So far, I see a bunch of extremely interesting aspects in that approach, both talking about „standalone“ services instead of web applications inside a large container and talking about „micro services“ as an idea in itself.
- On the one hand, it seems a good thing as it breaks down code into smaller chunks that seem easier to test, to handle, to rewrite. Indeed, in our current environment (and, here, in the legacy DMS and in our „older“ Java code), code complexity is an issue. There are large methods, large classes, large function blocks, and few people actually know how they fit together. There are a few automated test cases, but mostly, handling this code is tricky. From that point of view, implementing micro services seems a good thing: Come up with a simple RESTful endpoint, declare a „contract“ (in terms of endpoints/routes, HTTP verbs usable on these and data structures returned by them). Write test cases (for example using rest-assured) that validate whether a service implements these contracts. And then, implement the service until the test is „happy“. That’s not new – actually, this is how work possibly should be done. And yet: Such test cases, for example when run against an embedded Glassfish, do not necessarily expose errors that happen on „production“ infrastructure. And these tests, likewise, do not prevent developers from writing unmaintainable code. Talking micro services and, ideally, testing exactly the same code that runs in the production environment (including the required runtime), these test cases seem to get a bit closer to testing what actually runs in production. And by forcing service implementations to be „small“ (I read people talking about 10 .. 100 LOC here, although I doubt this is really doable…), there is little room for writing code no one can read five months from now.
- Talking about interface „contracts“ and tests all along with mostly lightweight implementations, you’re not that much tied to a particular language anymore. Most languages out there provide support for basic HTTP servers, and most languages also provide frameworks similar to Spark – just consider Sinatra (Ruby), Flask (Python) or Mojolicious (Perl – this is what our „relatives“ in Berlin currently are adopting). There might be various reasons for doing this – in our local environment, there are quite some situations in which we depend upon software solely running on Windows platforms whereas most of our production Java code runs on Linux VMs. Not being tied to a particular implementation language makes it easy to, for example, throw in a simple C#/.NET service doing one specialized thing without making your world too much more difficult (apart from a few fundamental aspects, see below…). If doing a rewrite using a different language – just use the tests you hopefully wrote and make them work again with your „new“ implementation.
- In some way, this seems to enforce a focus on limiting oneself to the things required. In a Java / Maven world, one is pretty quick to add a few new dependencies to a project whenever needed. In our case, the web app incorporates Spring and a bunch of its dependencies, and a bunch of layers on top of it (which are code of our own). In the end, you’re moving around loads of third-party code that can get in your way if it doesn’t work well, and that makes you depend upon it for a period of time that is hard to tell – trying to get some special Spring classes refactored out of our structure is quite a piece of work. If you want to go „micro-service“, you will possibly give up on a lot of these things as (a) you don’t want to make your service unnecessarily heavy and (b) you don’t need Spring to manage 100 LOC. 😉 Plus, you eventually end up with runtime dependencies that are a bit easier to manage. Just talking, for example, Spark: Compared to Spring anno 2014, the framework is extremely small – a Maven project including a bunch of classes. I hope not to be required to mess with it, but ultimately, it would not be all too difficult.
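The contract-first workflow from the first bullet above can be sketched roughly like this. rest-assured, as mentioned there, would be the more comfortable choice in practice; this sketch only uses the JDK’s own HTTP client (Java 11 or later) so it runs without extra dependencies. The /documents/42 route, the JSON shape and the class name are assumptions for illustration only, and the stub server merely stands in for a real service implementation.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class DocumentContractCheck {

    /**
     * Checks a running service against a (hypothetical) contract:
     * GET /documents/42 answers 200 with JSON containing "id":42.
     * Returns the list of violations; an empty list means the contract holds.
     */
    public static List<String> check(URI baseUri) throws IOException, InterruptedException {
        List<String> violations = new ArrayList<>();
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(baseUri.resolve("/documents/42")).GET().build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());

        if (response.statusCode() != 200) {
            violations.add("expected status 200 but got " + response.statusCode());
        }
        String contentType = response.headers().firstValue("Content-Type").orElse("");
        if (!contentType.startsWith("application/json")) {
            violations.add("expected JSON but got content type '" + contentType + "'");
        }
        // Deliberately crude body check; a real test would parse the JSON.
        if (!response.body().contains("\"id\":42")) {
            violations.add("response body lacks \"id\":42");
        }
        return violations;
    }

    /** Stand-in implementation so the sketch runs on its own; any language could serve this. */
    public static HttpServer startStub() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/documents/42", exchange -> {
            byte[] body = "{\"id\":42,\"title\":\"example\"}".getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().set("Content-Type", "application/json");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws Exception {
        HttpServer stub = startStub();
        try {
            URI base = URI.create("http://127.0.0.1:" + stub.getAddress().getPort());
            List<String> violations = check(base);
            System.out.println(violations.isEmpty() ? "contract satisfied" : "violations: " + violations);
        } finally {
            stub.stop(0);
        }
    }
}
```

Since the check only talks HTTP to a base URI, the very same check could be pointed at a Java service, a Mojolicious one or a .NET one, which is the point of the second bullet.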
But: Simply put, I think going down such a road will be a massive load of work, and it will be a process that makes us see interesting side effects and issues all along the way, some already on our radar, some likely to show up as things grow. All the things pointed out above surely do not come for free – just vaguely thinking about versioning in service interfaces, operation and monitoring infrastructure for making sure the system is reliably „available“ and doing the right things, integration testing of the whole load of services, backup/recovery scenarios and way more. Solving some of these issues will be more fun than solving others, and maybe some way down the road we will discover this architecture needs minor or major adjustments, too. Possibly, in the end, you’re rather well off assuming that, no matter which architectural approach you choose, you will not drastically reduce the complexity of the solution, as this will mostly be predetermined by the complexity of the problem you have to solve; you will just reduce (at best) the complexity and amount of code written by you – and eventually replace it with, for example, complex communication between distributed modules.
I am curious to see where this is heading. Generally, I appreciate the idea of Java EE for being a mature, well-supported set of architectural approaches, technologies and concepts. Yet I wonder whether there are things that also might be resolvable well outside the scope of pure, full-stack Java EE – reading through, for example, the amazon.com architectural description on highscalability.com, which has come of some age now, I saw them back then using Java but „just servlets“, not the rest of the Java EE stack, and most of the approaches even in micro services basically break down to using servlets, which are part of the Java EE web tier. Right now, I am curious to see what will happen in this direction…
… and, at times, I really wonder whether it would be worth spending time (again) trying to get together a meetup of people working with certain architectural and conceptual approaches regardless of which language or runtime platform they are using. The ideas gained from that, talking micro services, will eventually be the same all over.