Micro-experimentation Tools in Java 9

I jumped over to IntelliJ two years ago, and I've been really happy with my choice. However, there is one thing that really irritates me.

Reindexing.

We have a big codebase at my workplace, and when IntelliJ decides to reindex, it can take a *while*. In the worst case, my IDE is unavailable for 10 minutes, though the most painful case is the common one, which is about 30-45 seconds. 30-45 seconds is the perfect amount of time to distract me and break my concentration--I check Slack, check my email, get caught up in a customer issue, and by the time I come back, I've forgotten what I was working on and need to spend more time remembering where I was!


Anything that breaks flow is a frustration, and there are two flow-breakers that I've been thinking about as I've been playing around with the new Java 9 release. The first is one that we all know and love: Java's verbosity. The other is one that you might not have consciously run into yet: JVM optimizations.

JShell

Have you ever tried to do a quick experiment with a new Java library just to see how it works? Creating classes and methods and even variables can get cumbersome when you are just in exploratory mode. As of Java 9, Java has finally joined the ranks of programming languages with a REPL. Sweet!

Working in a REPL is so refreshing because I can simply call a method and see what it does. That makes it easy to learn about cool new features in Java 9, like the fact that I can finally create a map and its contents on a single line. Genius!


jshell> Map.of("Why", "did", "this", "take", "so", "long?");
$1 ==> {Why=did, so=long?, this=take}

jshell> Map.ofEntries(
   ...>   Map.entry("verbose_languages", Arrays.asList("Java")),
   ...>   Map.entry("terse_languages", Arrays.asList("Scala")));
$2 ==> {terse_languages=[Scala], verbose_languages=[Java]}

In JShell, I can play with this to my heart's content without needing to create a file, create a class, create a main method, compile, run, compile, run, compile, run, and then delete the file.
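For comparison, here is a rough sketch of the same experiment as a conventional source file. The class and method names (MapExperiment, question, languages) are made up for illustration; only Map.of, Map.ofEntries, and Map.entry come from the JShell session above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical class: the ceremony needed to run two one-liners outside JShell
class MapExperiment {
    // Same call as the first JShell line
    static Map<String, String> question() {
        return Map.of("Why", "did", "this", "take", "so", "long?");
    }

    // Same call as the second JShell line
    static Map<String, List<String>> languages() {
        return Map.ofEntries(
            Map.entry("verbose_languages", Arrays.asList("Java")),
            Map.entry("terse_languages", Arrays.asList("Scala")));
    }

    public static void main(String[] args) {
        // Iteration order of these maps is unspecified, so printed order may vary
        System.out.println(question());
        System.out.println(languages());
    }
}
```

Everything outside the two return statements is the scaffolding that the REPL lets me skip.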

JMH

How about trying to find out which algorithm or library is faster/better for your use case? If your algorithm executes in the microsecond range (or less), JVM optimizations start turning into noise, making it more difficult to make a scientific assessment.

I was very surprised by the outcomes Julien Ponge explained in his article about the trouble with writing benchmarks in Java. Here is a fun experiment to try, using three implementations of an algorithm:


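// Assumed definition: the snippet uses factor without declaring it.
// 0.5 makes exp(factor * log(x)) equal to the square root of x.
private static final double factor = 0.5;

// The "silly" square root: computed through exp and log
private static double mySqrt(double what) {
    return Math.exp(factor * Math.log(what));
}

// The square root from the standard library
private static double javaSqrt(double what) {
    return Math.sqrt(what);
}

// Does no work at all; simply returns its argument
private static double constant(double what) {
    return what;
}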

Create a simple benchmark that runs all three of these in series, comparing their performance by snapping time at the beginning and ending of each test.

If you want, you can use mine:

https://github.com/jzheaux/micro-experimentation/blob/master/04-jmh/cat-genealogy/src/main/java/com/joshcummings/cats/LoftyBenchmark.java
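If you would rather write your own, a naive version could look roughly like the sketch below. It is not the linked benchmark: the class name NaiveBenchmark, the time helper, the iteration count, and the FACTOR value of 0.5 are all assumptions made for illustration. It exists only to show the "snap the time before and after" style whose numbers should not be trusted.

```java
import java.util.function.DoubleUnaryOperator;

// Hypothetical naive benchmark: all three tests run in series in one JVM
class NaiveBenchmark {
    // Assumed value; 0.5 turns exp(FACTOR * log(x)) into a square root
    private static final double FACTOR = 0.5;

    static double mySqrt(double what) {
        return Math.exp(FACTOR * Math.log(what));
    }

    static double javaSqrt(double what) {
        return Math.sqrt(what);
    }

    static double constant(double what) {
        return what;
    }

    // Snaps System.nanoTime() before and after calling f `iterations` times
    static long time(DoubleUnaryOperator f, int iterations) {
        double sink = 0;
        long start = System.nanoTime();
        for (int i = 1; i <= iterations; i++) {
            sink += f.applyAsDouble(i);
        }
        long elapsed = System.nanoTime() - start;
        // Use the accumulated value so the loop has an observable result
        if (sink == 42) {
            System.out.println("unlikely");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        int iterations = 10_000_000;
        System.out.println("mySqrt:   " + time(NaiveBenchmark::mySqrt, iterations) + " ns");
        System.out.println("javaSqrt: " + time(NaiveBenchmark::javaSqrt, iterations) + " ns");
        System.out.println("constant: " + time(NaiveBenchmark::constant, iterations) + " ns");
    }
}
```

Try reordering the three calls in main and see whether the ranking changes with them.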

Crazy but true: Java 8 and earlier will show the silly square root to be the fastest and constant to be the slowest! (You still see it in Java 9, too, but the behavior is less pronounced.)

Learn More


More about each of these can be found in my latest Pluralsight video: Micro-experimentation Tools in Java 9. I'd love your feedback!

Curious JMH results between Java 8 and Java 9

I've recently been playing around with JMH and doing some comparisons between Java 8 and Java 9. I wrote the following toy benchmark, learning from the example that Julien Ponge wrote up in his article Avoiding Benchmarking Pitfalls on the JVM. This is my simple attempt to apply the principle:



import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Scope;

@State(Scope.Benchmark)
@Fork(1)
public class BenchmarkComparison {
  public double sqrt(double what) {
    return Math.exp(0.5 * Math.log(what));
  }

  private double what = 10d;

  @Benchmark
  public double baseline() {
    return Math.sqrt(what);
  }

  @Benchmark    
  public double correct() {
    return sqrt(what);
  }

  @Benchmark
  public double constantFolding() {
    return sqrt(10d);
  }

  @Benchmark
  public void deadCodeElimination() {
    sqrt(what);
  }

  @Benchmark
  public void deadCodeAndFolding() {
    sqrt(10d);
  }
}


Julien's post is intended to demonstrate common pitfalls that Java engineers fall into when it comes to benchmarking, with three of the methods above indicating incorrect ways to create a benchmark. I invite you to read his informative post to get more background information, if you like.

Running the above JMH benchmark in Java 8, I get the following results:


And here are the results in Java 9 on the same machine:


While this is a great example of why benchmarks need to be run on consistent JVM versions, what interests me more is why the results in Java 9 are so much "smoother". Why are they even the same order of magnitude?

I don't have the example handy, but I had a similar experience with Julien's very first experiment, which runs several benchmarks in the same JVM run, a known "no-no". In Java 8, I saw the same behavior as Julien, but in Java 9, I didn't until I added a third test to the benchmark. If I only added two, I didn't see the dramatic performance degradation.

Any ideas?


Published Author! Check out Scaling Java Applications Through Concurrency

I'm very excited to announce that my first Pluralsight course has just been published! You can check out Scaling Java Applications Through Concurrency here:

https://app.pluralsight.com/library/courses/scaling-java-applications-through-concurrency

If you happen to have a Pluralsight membership, I would love to get your feedback!

Here is the course description from the website:

"There are several gems inside the existing concurrency API that have been hiding in the background for years, waiting to be discovered by curious software engineers. The existing Java Concurrency API makes it much easier to build a Java application that is scalable and performant without having to settle for lots of low-level wait-notify usage or lots of locking using the synchronized keyword. In this course, Scaling Java Applications Through Concurrency, you'll cover several concurrency patterns simplified by the Java Concurrency API; these patterns will make scaling new and existing Java applications simpler than ever. First, you'll learn about how the Java Concurrency API has changed scalability and how to run processes in the background. Next, you'll cover classes that will help you avoid mistakes like lost updates when sharing resources. Finally, you'll discover how to coordinate dependent processes and implementing throttling. By the end of this course, you will be able to easily scale your Java applications through concurrency so that they work better and faster."

I'd like to give a special thanks to Brian Goetz and his book Java Concurrency in Practice as well as the collective knowledge in online blogs and, yes, Stack Overflow. I feel like I learned so much producing the course, and I hope that you get as much out of it as I did.

DVWA 1.9: File Inclusion Medium and High

Although I've studied and practiced secure coding standards for some time now, I had yet to try my hand at the offensive approach before last Friday when I downloaded DVWA and started working on the exercises.

File Inclusion

The file inclusion exercises were unexpectedly eye-opening. Initially, I thought: "Directory traversal, get the /etc/passwd file, etc., etc., not much here I don't already know." Then, I stumbled into Ashfaq Ansari's walkthrough of File Inclusion and Log Poisoning on DVWA Low, which showed to my astonishment how one could use this security hole to poison logs and subsequently upload a PHP shell to the DVWA server.

Clever. Not bad for a day's work, right?

Medium Level

Thanks to Mr. Ansari, I learned a lot more than I thought I would about the dangers of file inclusion security holes; however, there was more to come. On the medium level, the same directory traversal attack initially seems defended against with the following code:

$file = str_replace( array("http://", "https://"), "", $file);
$file = str_replace( array("../", "..\""), "", $file);

Now, the url parameter value "../../../../../etc/passwd" will instead be transformed into etc/passwd and nothing will show:


Blacklisting is hard, though, and a single-pass search and replace cannot remove all ills. Consider, for example, what would happen when performing a str_replace on "hthttp://tp://". You, of course, would be left with "http://", the thing you were trying to prevent from being in the string in the first place!

So, of course, if all one is going to do is remove the "../" instances from the string, we simply need to construct a string that will leave "../" instances in the wake of a search and replace, e.g. "....//....//....//....//....//etc/passwd" or "..././..././..././..././..././etc/passwd" will both do fine.
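The effect is easy to reproduce outside PHP. Below is a minimal sketch in Java, assuming that chained String.replace calls are a fair stand-in for str_replace with an array of patterns (both make a single left-to-right pass per pattern and never re-scan their own output). The class and method names are hypothetical.

```java
// Hypothetical stand-in for the medium-level filter
class SinglePassFilter {
    // One pass per pattern, in the same order as the PHP snippet above
    static String sanitize(String file) {
        file = file.replace("http://", "").replace("https://", "");
        return file.replace("../", "").replace("..\"", "");
    }

    public static void main(String[] args) {
        // The plain traversal is neutralized
        System.out.println(sanitize("../../etc/passwd"));        // etc/passwd
        // Nested patterns reassemble once the inner match is removed
        System.out.println(sanitize("....//....//etc/passwd"));  // ../../etc/passwd
        System.out.println(sanitize("..././..././etc/passwd"));  // ../../etc/passwd
        System.out.println(sanitize("hthttp://tp://example"));   // http://example
    }
}
```

Each nested input hides one "../" inside the characters that will surround it after removal, so the filter ends up producing the very string it was meant to strip.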


Now, the same steps of log poisoning and shell uploading can again be performed with relative ease.

The right way to defend against this is whitelisting, which the higher levels of this exercise employ.

High Level

Actually, I'm not certain quite how to leverage this, yet, but I thought I'd post some of my initial thoughts. The defense against file inclusion in the high level is incomplete because unintended patterns can get past it:

if ( !fnmatch("file*", $file) && $file != "include.php" ) {
    echo "ERROR: File not found!";
    exit;
}

Here, the glob pattern allows for the file protocol, e.g. page=file:///etc/passwd. PHP resolves file:// paths against the server's own filesystem, so this still reads local files on the server. I'm not sure yet how much further it can be pushed, but I found it interesting.
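To see why the pattern is too loose, here is a small Java sketch. It assumes the intent of the check is to accept any name that matches file* or equals include.php, and that fnmatch is called without flags, so its "*" matches any characters including slashes (which makes it equivalent to a prefix test). The class name, method names, and the allow-list entries are hypothetical.

```java
import java.util.Set;

// Hypothetical comparison of a prefix-style check and an exact allow-list
class HighLevelCheck {
    // Emulates: fnmatch("file*", $file) or $file == "include.php"
    static boolean passesPatternCheck(String file) {
        return file.startsWith("file") || file.equals("include.php");
    }

    // Assumed set of legitimate pages, for illustration only
    private static final Set<String> ALLOWED =
        Set.of("include.php", "file1.php", "file2.php", "file3.php");

    // Exact-match allow-list: nothing outside the set gets through
    static boolean passesAllowList(String file) {
        return ALLOWED.contains(file);
    }

    public static void main(String[] args) {
        String probe = "file:///etc/passwd";
        System.out.println(passesPatternCheck(probe)); // true
        System.out.println(passesAllowList(probe));    // false
    }
}
```

The pattern check only inspects the first four characters, so anything beginning with "file" is accepted, URL scheme included.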

A broken CompletableFuture invocation with an unexpected fix

During a recent training, I was demonstrating the class CompletableFuture and how it could be used to create non-blocking methods like this:


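    // The type parameter goes before the return type, and a method that
    // returns the future cannot be declared void.
    public static <T> CompletableFuture<Void> persist(T entity, Consumer<T> andThen) {
        return CompletableFuture.supplyAsync(() -> {
            System.out.println("Persisting entity...");
            return entity;
        }).thenAccept(andThen);
    }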

    public static void main(String[] args) {
        persist("toPersist", System.out::println);
        System.out.println("Done!");
    }

I wrote a method like the one above and ran the application. The output, unfortunately, was only:


    Done!

Where did the "Persisting entity..." output go?

(P.S.: The absolute worst feeling ever is when you create an example during a training, and it doesn't work! I definitely should have had a precreated example, but that is beside the point. =])

Since it can be hard to think when there are 30 people watching you, I did the first thing that came to mind. I created an ExecutorService and handed that reference to supplyAsync as the thread pool to use. Now the code says:


    private static final ExecutorService pool = Executors.newCachedThreadPool();

    // Same method as before, except that the pool is passed as the second argument
    public static <T> CompletableFuture<Void> persist(T entity, Consumer<T> andThen) {
        return CompletableFuture.supplyAsync(() -> {
            System.out.println("Persisting entity...");
            return entity;
        }, pool).thenAccept(andThen);
    }

And it worked! It was a bit of a shock, but I was pleased that I could fix it and move on to other topics. :)

After class, though, it gnawed at me, and I was excited when I finally got a minute to sit down and poke at it.

After peeling away a couple of layers, I found that I could reproduce the problem in the following way:


public static void main(String[] args) {
    ExecutorService pool = new ForkJoinPool(1); //Executors.newCachedThreadPool();

    pool.execute(() -> System.out.println("Done"));

    pool.shutdown();
}

This example will only print out "Done" occasionally. However, if I change it to:


public static void main(String[] args) {
    ExecutorService pool = Executors.newCachedThreadPool();

    pool.execute(() -> System.out.println("Done"));

    pool.shutdown();
}

It will work every time.

What's going on? Long story short, I posted my Fork/Join vs ThreadPoolExecutor question to Stack Overflow, and John Vint gave the simple answer that all threads in the Fork/Join pool are daemon threads, which means that the VM will stop running without waiting for them to complete. The ThreadPoolExecutor creates non-daemon threads, which causes the runtime to wait until they are done.

What does this mean for the in-class example I gave? If you need the runtime to hang on while your CompletableFuture is finishing, pass an Executors-obtained thread pool as a second parameter since by default it uses the Fork/Join common pool.
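The daemon difference can be observed directly. The sketch below is only an illustration: the class name DaemonCheck and the helper workerIsDaemon are made up, and the persist method is a trimmed copy of the in-class example. It also shows a second option that I'm confident works: keep the returned future and call join() so that main waits for it.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ForkJoinPool;
import java.util.function.Consumer;

// Hypothetical demo of daemon vs non-daemon worker threads
class DaemonCheck {
    // Runs one task on the pool and reports whether its worker thread is a daemon
    static boolean workerIsDaemon(ExecutorService pool) {
        try {
            return CompletableFuture
                .supplyAsync(() -> Thread.currentThread().isDaemon(), pool)
                .join();
        } finally {
            pool.shutdown();
        }
    }

    // Trimmed version of the in-class example, using the default executor
    static <T> CompletableFuture<Void> persist(T entity, Consumer<T> andThen) {
        return CompletableFuture.supplyAsync(() -> entity).thenAccept(andThen);
    }

    public static void main(String[] args) {
        System.out.println("ForkJoinPool worker is daemon: "
            + workerIsDaemon(new ForkJoinPool(1)));
        System.out.println("Cached pool worker is daemon: "
            + workerIsDaemon(Executors.newCachedThreadPool()));

        // Waiting on the future keeps main alive until the work completes
        persist("toPersist", System.out::println).join();
        System.out.println("Done!");
    }
}
```

Which option fits depends on whether the caller should block: join() makes main wait, while a non-daemon pool lets main return and keeps the JVM alive until the pool is shut down or its idle threads time out.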

An Advanced Java Reading List

Recently, I finished doing a 32-hour training over 8 days for a group of 30 Java professionals. We covered several topics, and I published to them a list of further reading that has influenced me as a developer over the years. Perhaps you will find value in the same list I shared with them:


  1. Head First Design Patterns - My co-workers and I studied this book a chapter a week together over lunch back in 2007. Decorators, Observers, and Strategies completely changed my perspective on how to develop code prepared for change.

  2. Effective Java (2nd Edition) - After having developed in Java for 14 years, I finally picked up Josh Bloch's book, and after the first chapter I was sorry I had waited so long. It explained and validated many of my long-held practices as well as introduced me to additional good ones, like the self-documenting power of public static factory methods as named constructors.

  3. Java SE 8 for Programmers (3rd Edition) (Deitel Developer Series) - Java 8 is the coolest Java release since Java 5. Method references and lambdas immediately changed the way I code for the better. In addition, java.time is the Date/Time API we've always wanted. Deitel's book is great at exposing all the great new features that will change the way you code in Java.

  4. Starting Out with Java: Early Objects (4th Edition) (Gaddis Series) - Alternatively, there were some attendees who wanted a good foundational Java book. I've used this book when teaching intro Java classes, and I like it! Especially, I like the early introduction to object-oriented programming.

  5. Java Concurrency in Practice - This is another eye-opening book that I waited far too long to read. Atomicity I understood, but I had never considered the ideas of visibility or re-ordering. And I was greatly helped by its explanation of what to do with InterruptedException.

  6. Java Performance: The Definitive Guide

  7. Iron-Clad Java: Building Secure Web Applications - I included this one because security is often tacked on after running some vulnerability assessment. While I might agree that one should "optimize after", security should be built in from the start.

  8. Mastering JavaServer Faces 2.2 - To be honest, I prefer an action-based framework like Spring MVC, but JSF 2.2 caught my eye, especially with the new HTML5-friendly JSF attribute syntax. More than that, this group of engineers asked for training on it, and this is a great book to get a more comprehensive view. :)

  9. REST in Practice: Hypermedia and Systems Architecture - This book is not quite as practical as the rest of them, but I really liked the theory outlined here, mixed with code examples. Through this book, I better understood the gaps that exist between existing libraries and what REST specifies.

  10. Pro JPA 2: Mastering the Java(TM) Persistence API (Expert's Voice in Java Technology)

  11. Java Message Service - This book helped me understand an API that up to that point had seemed so inaccessible to me. That and Spring Boot made it super-easy! :)

Enjoy!

Lemon Squash and 20 Minutes of Coding


HTML-encoded Lemon Squash
A couple of weeks ago, I decided I would try an experiment with my two oldest children and my wife sort of similar to the time I planted a Lemon Squash in our backyard:

Have we tried it before? No.
Do we know if we'll like it? No.
Do we know if it will even grow in our climate zone? No.

Sounds like a winner!

So here goes... having kids learn along with non-coder Mom is like growing lemon squash in the garden:

Lemon Squash is Hearty


We learned very quickly that lemon squash didn't require a lot of maintenance, wasn't vulnerable to squash bugs like all our other squash plants, and basically grew even with us often forgetting to water it. For a family of six kids, that's a big plus. :)

Likewise, having the boys learn HTML using codecademy.com was a very self-directed process. Compared to the logical dead ends that my students at Neumont will run into with if statements and for loops, my boys, in the world of HTML, were relatively impervious to bugs. The closest they got to a head scratcher was the following:


  ...
  <h3>Seven Things I Like To Do
  <p>Play the piano</p>
  <p>Read books</p>
  ...

(This was made trickier because codecademy said that the solution was correct.) In HTML, all tags must be closed when we are done with them, much like turning off the bathroom light when you are done with it (this analogy worked for my older, more rules-conscious son). When the above is rendered by a browser, all three lines are in bold:

Seven Things I Like To Do
Play the piano
Read books


Of course, my boys didn't know that this was a bad thing, and neither did my wife. They didn't know what it was supposed to look like in the end.

However, when my wife checked it over, she noticed the error based on the detailed instructions and explained the concept as she understood it to the boys. The boys fixed the error, and it then looked like this:


Seven Things I Like To Do

Play the piano
Read books

At that point, my wife said "Oh, so that's why it was all bold: The browser thought the paragraphs were part of the header." Bingo!

Small bugs like these quickly became easy for my boys and my wife to squash (see what I did there?) repeatedly until the entire species retreated into extinction.

Further, other than tag issues, my boys needed very little direction from Mom. Without any parental instruction, my younger son began shouting "DOCTYPE!" at the beginning of each exercise since he knew it was required no matter what, which filled me with the same amount of pride as the first time he recited the "Inigo Montoya" line from the Princess Bride by heart.

In the end, they were able to get through the first 15 exercises with only minimal correction from my wife. I personally never once corrected the kids or my wife. Sometimes my wife would ask me a theoretical question or two after the kids went to bed, though. (By the way, those conversations were fun. It was the first time that I can remember my wife showing a technical interest in what I do for a living.)

While hearty, the process wasn't perfect. There was some churn around the fact that codecademy would often tell the boys they were right when actually there were some (what I would consider) important syntax bugs in the HTML they produced. Perhaps the software was built to be lenient. That said, my wife caught the issues and was cognizant enough to know the difference between following the instructions and just getting the software to say "correct!".

Lemon Squash requires water, soil, and sunlight just like every other plant


My wife and I, it turns out, didn't really need to know much about photosynthesis to effectively grow good lemon squash. We just needed to do simple, understandable things like plop the plant in soil that gets sun and pour some water on it. Still, those were necessary elements of making it grow.

Likewise, my wife didn't have to have a lot of formal training, be a genius, or do anything more than use the skills she already had to be there for her boys.

You don't need to be a genius to help kids learn to code.
To be truthful, my wife is a smart cookie, so I won't diminish how her university education may have been brought to bear so that she could assist our boys with such ease. The truth is, though, that codecademy did most of the work. They would read and follow the instructions and the software would do an okay job of guiding them when they misunderstood and did the wrong thing. Thereafter, my wife would "water" by checking things over and helping them make the corrections that the software incorrectly skipped over.

You could do this, right?
The flow that worked the best was to water regularly. Kristi would hang out while the boys completed the exercises, and she would check each exercise the moment the boys finished. This was a bit of an investment in the beginning; however, after the boys understood the start-and-end-tag thing and a little bit about nesting tags, they were on their way.

The first couple of times, Kristi did not do this, resolving to check all their work in the end. This ended up taking much longer because mistakes made on the first exercise would perpetuate through the next five or six. This way, they would practice the wrong way for 20 minutes and then need to be retaught by my wife (which required her to go back through each exercise on her own, doubling the time it took to make progress). This created frustration for all three. Once they changed to the more hygienic practice of checking after each one, it only took a minute or two cumulatively; the boys made more progress and Kristi was less frazzled.

Lemon Squash doesn't taste good to everyone


In the end, I was the only one who would eat the Lemon Squash. I never really understood that because I thought they were excellent. Of course, my children "prefer" only that which is sugar-laced, so that could be part of it.

I asked my boys what they thought, and they said it was "fun".
<p>I like trains.<p>
They liked putting silly phrases in the header and paragraph tags and making lists of hobbies, TV shows they liked, and actors they found annoying. They liked adding images to the page and linking to their favorite sites.

I asked my wife and she said, "No, I'm just too busy." For her, squeezing out five 20-minute sessions where she directed the learning environment was tantamount to me asking her to pull all our children from the public school system and prepare them for college by herself. It made me admire her support all the more, but it also helped me to see that not every parent will be able to carve 20 minutes consistently out of their day to sit and code with their child.

Conclusions


I'm interested enough in this that I'm going to try and get a few more non-coder parents to try it out. The coding world is less scary than it used to be and non-coders can teach other non-coders to code using tools like codecademy. Anyone interested?

It was important for an older person with more training in paying attention to detail to be there with my boys to make sure they stayed on target. With a small initial investment, they quickly became self-directing 90% of the time. It would be interesting to see a button like "Have a Live Volunteer Look At Your Code And Give You Tips" on codecademy or some other site so that Mom or Dad can have a bit more flexibility.

There were a couple of times when the software was incapable of noticing that the syntax was incorrect. Software has bugs, and that's okay; a way to still be helpful while bugs are being fixed might be to have an example picture of what the solution ought to look like in the end so students have a visual way to spot check their solution against the instructions.

Twenty minutes a day was good, though I think that it would eventually need to turn into more. The exercises are written to allow someone to learn piecemeal, but as the concepts get trickier and more abstract, it may take an hour or two of practice before a student feels confident in her understanding. This kind of time commitment would likely come when the student really decides to invest herself in non-trivial coding (like going from picture books to chapter books when learning to read).

Non-coder parents should try this out! My wife did it, and so can you!