Maintaining Open-Source RxAndroidBle Library (Handling and Handover)

PART 1—User Handling

Picking your battles

As a developer creating a library, you should take into account that users do not know everything about how your code, or the system underneath it, works. You have to be prepared for that and decide beforehand how to handle it. Usually, there are two options.

Defensive programming

Failing gracefully, as it’s sometimes called, is a nice solution—especially from the user’s point of view. If users try to do something that will fail due to lack of knowledge, you can always try to guide them to get things done anyway.

For example, the Android BLE API allows only one operation to be executed at any given moment, and trying to run another one causes it to fail immediately. This case can be easily managed by adding a command queue and executing the operations one by one. The user won’t see any hiccups.
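The queue idea can be sketched in plain Java. This is only an illustration under stated assumptions, not the library's actual implementation: the names `SerialOperationQueue` and `enqueue` are hypothetical, and a single-threaded executor stands in for whatever scheduling the real library uses.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical helper: runs submitted operations strictly one at a time, in submission order. */
final class SerialOperationQueue {
    // A single worker thread means a second operation cannot start before the first one finishes.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    /** Queues the operation and returns a handle to its eventual result. */
    <T> Future<T> enqueue(Callable<T> operation) {
        return worker.submit(operation);
    }

    /** Stops accepting new operations; already queued ones still run. */
    void shutdown() {
        worker.shutdown();
    }
}
```

A caller can enqueue a read and a write back to back; the write only starts once the read has completed, so neither fails because the other is still in flight.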

Another example is starting an Android BLE scan, which may fail due to internal Android issues. You can potentially work around it by starting a classic Bluetooth scan and filtering out devices that are not Low Energy. Because a classic Bluetooth scan doesn’t contain all the information available through the BLE scan, whether this workaround is good enough depends on what the user is interested in.

Defensive programming is not a silver bullet, though. There are situations in which the managed approach leads to subtle differences in how the program behaves, and eventually even to bugs that are difficult to investigate. We have encountered one such problem ourselves when establishing a BLE connection. The BLE specification says that a single device may only sustain a single connection, but users who are unaware of that could try to call RxBleDevice.establishConnection() again. So what could be done while using defensive programming?

  • Return the same connection to the second subscriber. This seems to be a valid approach. If the connection is already established, it is possible to just give it to the second subscriber and forget about the problem. Unfortunately, the BLE communication is usually stateful and a request-response communication model is widely used. If two subscribers try to communicate at the same time, the responses may get mixed. We all know that when the users do not understand what is going on, they tend to reach out to developers for help… and we wouldn’t want that :)
  • Deliver the connection once the first subscriber unsubscribes. Again, although it seems a legitimate tactic, users may be confused: they expect either a connection or an error to be returned quickly, so the process may seem stalled.

So what turned out to be the best approach? Read more below!

Offensive programming

When users try to perform an action that they shouldn’t, a kind of IllegalOperationException is emitted with a verbose explanation. The documentation explains that a connection is already being established and that interference may occur. Users can still look through their code, find the place where they call RxBleDevice.establishConnection(), and use the .share() operator to get the same RxBleConnection instance in more than one place, but they will be explicitly aware that it’s the same connection. The key is to be precise and clear when communicating what is going wrong and why. Speaking of which…
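A minimal sketch of this fail-fast behaviour, written in plain Java without RxJava. The class `ConnectionGuard`, its methods, and the message wording are hypothetical and only illustrate the idea; the library's real exception type and text may differ.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical guard illustrating the fail-fast approach; not the library's actual code. */
final class ConnectionGuard {
    private final String macAddress;
    private final AtomicBoolean connectionInUse = new AtomicBoolean(false);

    ConnectionGuard(String macAddress) {
        this.macAddress = macAddress;
    }

    /** Claims the connection, or fails loudly if somebody already holds it. */
    void claim() {
        if (!connectionInUse.compareAndSet(false, true)) {
            throw new IllegalStateException(
                    "Already connected to device " + macAddress + ". "
                            + "Only one connection per device may be open at a time. "
                            + "Share the existing connection explicitly instead of establishing a second one.");
        }
    }

    /** Releases the connection so that it can be established again later. */
    void release() {
        connectionInUse.set(false);
    }
}
```

The second caller gets neither a silently shared connection nor a stalled request. It gets an immediate error that names the device and says what to do instead.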

Make exceptions meaningful

Not many things can make the user more puzzled than getting an exception without any explanation. Confused users are definitely going to get back to you, which is why it’s important to share your knowledge about what is happening and save everyone’s time. While working with the vanilla Android BLE API we encountered numerous status codes that come from BluetoothGatt, and only a few of them are documented. It took a lot of time to gather enough knowledge to work with this API fluently but, since the users didn’t possess our know-how, they kept getting back to us. Initially, when this kind of error happened, we emitted an exception like:

BleGattException{
  macAddress=XX:XX:XX:XX:XX:XX,
  status=133,
  bleGattOperationType=CHARACTERISTIC_READ
}

What would the user learn from it? They would know that a certain GATT exception happened on a device with a given MAC address during a characteristic read. But what does the 133 status mean? Where can it be found? This error is actually described in the source code of the Android OS. The description is still vague, but at least it is possible to find out where the error originates and what it’s connected to. To save other users from searching online or reaching out to you for clarification, a simple change can be made:

BleGattException{
  macAddress=XX:XX:XX:XX:XX:XX,
  status=133 (0x85 -> https://android.googlesource.com/platform/external/bluetooth/bluedroid/+/android-5.1.0_r1/stack/include/gatt_api.h),
  bleGattOperationType=CHARACTERISTIC_READ
}

And because the library only routes information from the layers below without fully understanding what went wrong (codes may change or be added between OS versions), it was a good idea to point users right away to the place where they can gather more information. The same goes for wrapping Throwables that may be thrown by the system. The best solution is to move them into the domain of your library with the best explanation available.
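One way to build such a message is sketched below. The class name `GattStatusException` and its constructor are hypothetical and chosen for illustration only; the link is the one from the example message above.

```java
import java.util.Locale;

/** Hypothetical exception that decorates a raw GATT status code with a pointer to more information. */
final class GattStatusException extends RuntimeException {
    // Link taken from the example message in the article.
    static final String STATUS_CODES_URL =
            "https://android.googlesource.com/platform/external/bluetooth/bluedroid/+/android-5.1.0_r1/stack/include/gatt_api.h";

    GattStatusException(String macAddress, int status, String operationType) {
        super(describe(macAddress, status, operationType));
    }

    /** Prints the status in decimal and hex, because the Android sources list the codes in hex. */
    static String describe(String macAddress, int status, String operationType) {
        return String.format(
                Locale.ROOT,
                "GattStatusException{macAddress=%s, status=%d (0x%02X -> %s), bleGattOperationType=%s}",
                macAddress, status, status, STATUS_CODES_URL, operationType);
    }
}
```

The library does not need to know what every status code means. It only needs to format the code the way the upstream source lists it and tell the user where to look.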

Wrap up: the Developer

So what can you do as a developer to save your and the users’ precious time?

  • Think about what approach to known problems your library may take (offensive vs defensive programming)
  • Wrap errors in a way that will route the users to a place where they can find more information

PART 2—Handover

Perspective: the Maintainer

Having a well-developed, meaningful, easy-to-use API and ways to share knowledge with the users is great. But time passes and people move on. Chances are that at some point you will stop developing your module or library and new developers will be introduced to the code. Unfortunately, they will not have all the domain knowledge you have gathered. The same can happen to the original developers, as sometimes we don’t remember code we wrote a few months back. Our memory can trick us.

So how can we preserve this knowledge? Let’s look at an example from the library. A Bluetooth Low Energy scan can fail to start for many reasons, and it’s good to inform the user why:

class ScanPreconditionsVerifierApi24 {
    void verify() {
        this.scanPreconditionsVerifierApi18.verify();
        wasScannedTooManyTimes();
    }
}

By reading this code, both you and the maintainers will learn only that for API 24 (Android 7.0) the scan verifier runs all the checks made for API 18 (Android 4.3) and additionally checks whether the scan was started too many times. It is simple and fairly self-documenting code. What it lacks is context. There is no information about why this call was made, and for a new maintainer it will look very similar to:

class ScanPreconditionsVerifierApi24 {
    void verify() {
        this.scanPreconditionsVerifierApi18.verify();
        if (getSomeInt() == 42) {
            throw new IllegalStateException();
        }
    }
}

Here 42 is just a MAGIC_NUMBER. It carries no context whatsoever about why it is done this way. The remedy is actually quite easy: you can always use a good, old-fashioned comment:

class ScanPreconditionsVerifierApi24 {
    void verify() {
        this.scanPreconditionsVerifierApi18.verify();
        /*
         * Android 7.0 (API 24) introduces an undocumented scan throttle for applications that try to scan more than 5 times during
         * a 30 second window. More on the topic: https://blog.classycode.com/undocumented-android-7-ble-behavior-changes-d1a9bd87d983
         */
        wasScannedTooManyTimes();
    }
}

Pretty easy, right? Whoever stumbles upon this piece of code will immediately get more insight into why it was put here and will be pointed to a place where they can learn more.

There is one more place that is usually neglected: commit messages in the version control system. Developers tend to make them as short as possible.

Fixed a bug in the login view

This is what it usually looks like. Unfortunately, it will not give much information to the people who git blame the code in the login view to check why a particular line was introduced. This knowledge is lost. A much better solution would be to at least drop in a link to the related issue:

[ISSUE-321] Fixed a bug in the login view.

But it could be even better! Most git clients treat the first line or the first 60 characters of the commit as its title, but the commit message itself may be arbitrarily long. Then even people who do not have access to the issue tracker get more information, and others may not need to dig any deeper!

[ISSUE-321] Fixed a bug in the login view

When the user started to log in but canceled the login
by sending the application to the background, the
application ended up in an inconsistent state: the UI
thought the user was not logged in, but the user could
not log in because the login manager was already logged in.

If the maintainers are not interested in the details, they can check only the title, but if they need more context, it is available to them right away!

And last but not least:

TESTS!

Tests turned out to be one of the most important parts of development. They are basically a safety net for you and the maintainers. They allow for the major refactorings that are sometimes needed when adding new functionality, without the fear of introducing regressions. Testing works best when applied right from the beginning, or at least at the time of writing the code for a new feature. Of course, it’s not possible to test against all possible bugs upfront, but a good practice when a new bug is found is to first add a test case that reproduces it and only then fix it.
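As an illustration of the test-first bug fix, here is a sketch modelled on the login bug from the commit message example earlier. Both `LoginManager` and the test are hypothetical and written in plain Java without a test framework; in a real project the test would live in whatever framework the code base already uses.

```java
/** Hypothetical login manager, modelled on the bug described in the commit message example. */
final class LoginManager {
    private boolean loginInProgress = false;

    void startLogin() {
        if (loginInProgress) {
            throw new IllegalStateException("A login is already in progress");
        }
        loginInProgress = true;
    }

    /** The fix: cancelling resets the state, so the next login attempt is accepted. */
    void cancelLogin() {
        loginInProgress = false;
    }
}

/** Regression test for the bug: it fails whenever cancelLogin() stops resetting the state. */
final class LoginManagerRegressionTest {
    static void loginCanBeRestartedAfterCancel() {
        LoginManager manager = new LoginManager();
        manager.startLogin();
        manager.cancelLogin();
        // Throws IllegalStateException if the cancelled login left the manager stuck.
        manager.startLogin();
    }
}
```

Writing the test before the fix confirms that it really reproduces the bug, and keeping it afterwards guards against the same bug coming back during a later refactoring.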

Summary

All of the above is not just empty talk. During the development of this project, which is quite long-running by the standards of the mobile industry, we made several mistakes that wasted a lot of our own and our users’ time. These are the rules we use on a daily basis, as they simply enable us to deliver better code in a shorter time. They also make life significantly less stressful for the people who work with the code!

Darek

Staff Software Engineer

Did you enjoy the read?

If you have any questions, don’t hesitate to ask!
