This Blog is my (#Scalasyd)[https://github.com/scalasyd/scalasyd] talk about refactoring Scala code to be more idiomatic
It is also a slide deck for the talk given at the ScalaSyd meetup May 2018.
To switch between modes press a number as follows :
- ‘1’ -> Doc mode:
- shows the document as intended.
- ‘2’ -> Deck mode, see the slides
- ‘4’ -> Lecture Mode
- enter zooms current navigated to section
- click zooms div or block clicked
Arrow keys navigate to next or previous section or slide
Refactor
Ugly Scala Code
9th May 2018
Code Weaver
By
To present this document press 2
. Press Esc
to get back to document view. Left and Right arrow keys to navigate.
See suited.js
In this blog I’ll be talking about working with Scala to build applications.
Confession
- First I need to admit to a dirty little secret I’ve been keeping…
The Ugly Truth
- It can be easier to hack away in an imperative style
-
-
The Ugly Truth
- It can be easier to hack away in an imperative style
- Especially when hacking on a new Java API
-
The Ugly Truth
- It can be easier to hack away in an imperative style
- Especially when hacking on a new Java API
- Or noodling around to get an understanding
The Excuses?
- Thinking in Java
- looping in loops
- helps walk or build a data structure
- while I’m developing it
-
I have a bunch of justifications for this behaviour they may not be good excuses but here they are.
The Excuses?
- Thinking in Java
- looping in loops
- helps walk or build a data structure
- while I’m developing it
- Types get in the way less
The Problem?
- Sneaky Bugs can exist
- Hard to test pieces in isolation
- Hard to understand later or refactor?
-
The Problem?
- Sneaky Bugs can exist
- especially in the edge cases
- Hard to test pieces in isolation
- Hard to understand later or refactor?
-
The Problem?
- Sneaky Bugs can exist
- especially in the edge cases
- Hard to test pieces in isolation
- Hard to understand later or refactor?
- you may forget to take account of some out of view mutation
-
The Problem?
- Sneaky Bugs can exist
- especially in the edge cases
- Hard to test pieces in isolation
- Hard to understand later or refactor?
- you may forget to take account of some out of view mutation
- Gives me the creeps.
The code written in the style, that you are going to see, runs ok (I think) but I am always wary of it, because I know how hard I had to work to get it running.
Coding it was a trial and error affair, with contant compiling and running and attaching debuggers to test what was happening to my mutable placeholders.
The trick is to refactor it to a more imutable functional style as soon as it is “working”.
It is always best to refactor runnable code because we can see that the refactor caused no disernable difference to the outcome by simple running it.
Tests help of course, but the problem with testing mutable sections of code is that you have to test from the outside, as alluded to above it is hard to test an isolated fragment to see if it works.
The Solution?
- Once Running Refactor
- while you still understand the problem
- Remove mutation
- Remove loops
- Learn the API you are working in
- it may well have realy helpful methods/functions for you to use.
- Discover nice general patterns
My Example?
This code is extracted from a Crypto currency tool I wrote to help me keep track of my position on coinspot.
Unfortunatly at the time coinspot api did not allow me to get all the data from them so I needed to extract my trade history as csv,
parse it and build up a data structure to allow me to query my position and other useful info.
Using my Position data structure I can see what currencies I’m making money in and ask questions ike “how much bitcoin did I buy below this price?”
The big problem solved above is that the trade history showed prices for cross trades in terms of each other eg bought 1.2 ETH for .05 BTC but I want to know my
position in AUD for each currency, so I was forced to look to the internet for historical prices of the coins in AUD at the price of the time of the trade to work backwards.
Here you can see I have a list of previously known prices from which I build up a PriceTable
which is a Map
of PriceTableForCurrency
.
This is really a big map of maps. The caller of this function (in this case the main function) saves the newly discovered prices to a file that I read and parse on start-up to contruct my known prices List.
In short what I need to do is:-
PriceTable
- I need to build up a PriceTable for Currencies.
-
PriceTable
- I need to build up a PriceTable for Currencies.
-
PriceTable
- I need to build up a PriceTable for Currencies.
- So I can
PriceTable
- I need to build up a PriceTable for Currencies.
- So I can
Build a PriceTable
- I need to build up a PriceTable for Currencies.
- So I can
- look up a currency and
- from there look up a price a date I care about
For context you should know that to help me work around the Map of Maps in the imperative style I created a few type aliases and helper classes to make it easier to reason about, the main structures are as follows:-
PriceTable
// a type alias
type PriceTable = Map[Currency, PriceTableForCurrency]
Here PriceTableForCurrency
is just a case class
that wraps a Map
but has a helpful API that allows me to augment it and build it up from HistoricPrice
objects
which I get from parsing the CSV file of previously discovered historic prices.
PriceTableForCurrency
case class PriceTableForCurrency(currency: Currency, prices: Map[LocalDateTime,Double]) {
/**
* Will overwrite if we alredy have a price at this time
* @param historicPrice
* @return
*/
def +=(historicPrice: HistoricPrice) = {
val hp = (historicPrice.localDateTime, historicPrice.quotePrice)
this.copy(prices = prices + hp)
}
def ++=(other:PriceTableForCurrency) = if(other.currency == currency) this.copy(prices = prices ++ other.prices) else this
def asHistoricPrices: List[HistoricPrice] = {
import scala.collection.mutable.ListBuffer
val lb = ListBuffer[HistoricPrice]()
prices.foreach(tup => {
lb += HistoricPrice(tup._1,currency,tup._2)
})
lb.toList
}
}
My Example - quick re-read for context
The first part I want to look at is building up the initial PriceTable
(effectivly a Map of Maps) from the known prices list, which I read in from a CSV file.
Building a Map of Maps is something that I do a lot while coding, and it is a pain every time.
Walking a list and examining the Map to see if I have the Key in the map already and adding it and a value if I don’t have it or modifying the value if I do have it is tedious for a single Map. It is exponentially more tedious and error prone for each level of nested Map.
If only there was a better way.
Lets create a better Map of Maps
Lets create a better Map of Maps
We can contruct a Map like this:
val myMap = Map(“a” -> “little A”, “b” -> “little B”)
The -> is a function that create a Tuple
-
Lets create a better Map of Maps
We can contruct a Map like this:
val myMap = Map(“a” -> “little A”, “b” -> “little B”)
The -> is a function that create a Tuple
- and so this is like building a Map from a list of Tuples
Lets create a better Map of Maps
We can contruct a Map like this:
val myMap = Map(“a” -> “little A”, “b” -> “little B”)
The -> is a function that create a Tuple
- and so this is like building a Map from a list of Tuples
This makes sense. A Map
can be thought of as a list of Key and Value pairs.
So if I can convert my List
into a list of Key/Value pairs, ie a Tuple2
or (Key,Value)
,
I should be able to turn it into a Map.
Lets create a better Map of Maps
Lets create a better Map of Maps
Awesome. I can do it using the standard library.
Lets create a better Map of Maps
- So we can build a map from a list but what about a
Map
of Maps
?
Lets create a better Map of Maps
Lets create a better Map of Maps
Mmm. This is giving me a clue. Maybe I can refactor this.
Lets create a better Map of Maps
This is good I’m getting a raging clue now.
Lets do it - fix Map of Maps trial 1
Lets do it - fix Map of Maps trail 2
Lets do it - fixed Map of Maps
That was great. Now I have a functional tool to construct maps and maps of maps.
List.groupBy
Now fix the ugly foreach
Ok now that is done lets look at the ugle walk of the xtrades (cross trades) list using foreach.
While its better than a forloop (which could adds another possible iteration bug) the body of the foreach is convoluted.
The problem is that it is doing too many things at once.
I need to break it down into smaller problems which will not only make it easier to test and debug but will structure the code around clearer thinking.
Lets fix that ugly foreach
- Instead of trying to do 3 things
Lets fix that ugly foreach
- Instead of trying to do 3 things
- walk trades looking for a historic price miss
-
-
Lets fix that ugly foreach:
- Instead of trying to do 3 things
- walk trades looking for a historic price miss
- fetch the new historic price
-
Lets fix that ugly foreach:
- Instead of trying to do 3 things
- walk trades looking for a historic price miss
- fetch the new historic price
- add it to Map of Maps
Lets fix that ugly foreach:
- Instead of trying to do 3 things
- walk trades looking for a historic price miss
- fetch the new historic price
- add it to Map of Maps
- looking out to see if we have the key or not first
-
-
Lets fix that ugly foreach:
- Instead of trying to do 3 things
- walk trades looking for a historic price miss
- fetch the new historic price
- add it to Map of Maps
- looking out to see if we have the key or not first
- and if not adding else just adding
-
Lets fix that ugly foreach:
- Instead of trying to do 3 things
- walk trades looking for a historic price miss
- fetch the new historic price
- add it to Map of Maps
- looking out to see if we have the key or not first
- and if not adding else just adding
- again looking to see if we have the nested key etc..
Lets fix that ugly foreach - why not this?
Lets fix that ugly foreach - why not this?
- Filter the list of trades
-
-
Lets fix that ugly foreach - why not this?
- Filter the list of trades
- to create a List of trades that dont have a price
-
-
Lets fix that ugly foreach - why not this?
- Filter the list of trades
- to create a List of trades that dont have a price
- Then just get them
- I can worry about how to stuff the new prices in the
PriceTable
later
That was simple. As a side benefit, now I am not iterativley Building the lists and Maps, I can get rid of all the var
placeholder Maps
and Buffers
A close look at the the code shows that I was able to stuff the new prices into the imutable PriceTable
by folding down the new prices.
Add to imutable PriceTable
- use a fold
- have the starting accumulator be the current
PriceTable
Ok
so …?
- We’ve tidied most of it up
- by mapping and filtering and folding we dont need
var
placeholders anymore
-
so can I prematurely optimize yet?
So now the code is tidy and manageble, is it time for me to speed up that slow price fetch from the internet?
so can I prematurely optimize yet?
so can I prematurely optimize yet?
- I will use
scalaz
Task
- to wrap the get historic prices function
- and return a List[Task[PriceTableForCurrency]]
- rather than a List[PriceTableForCurrency]
- then gather all the tasks in parallel
- to give me back a List[PriceTableForCurrency]
Lets do it
$ git checkout start_5
$ # clear out the existing prefetched prices
$ cat /dev/null > $HOME/uglycode/coinspotpos/historicPrices.csv
$ sbt clean run
So lets have a look at how much tidier the code is now.
Ugly?
Pretty?
Nice.