Diagnostic Trouble Codes

Introduction

Diagnostic Trouble Codes, abbreviated DTCs, are the foundation of automotive diagnostics. In essence, they are 3-byte IDs that represent a specific point of failure or problem in your system (to be fair, SAE-J2012 describes them as 2 bytes, whereas ISO-14229 has them as 3 bytes). If there are failures or issues somewhere in your system, then ECU software can log the failure in the form of setting the trouble code that’s associated with that failure.

The purpose of DTCs is to help repair and diagnose the complex electro-mechanical systems that are in an automobile. When a failure occurs in one of those on-board systems, a unique DTC code becomes activated. Service technicians can read all the DTCs that are active and determine which specific part of a system failed.

Take for example an ECU which interacts with an external analog sensor:

An ECU with an external temperature sensor

Internally, a well-designed ECU would make sure to interface with the sensor in such a way as to be able to diagnose a connection failure. The system designer would intentionally allocate a DTC to the failure of the sensor connection, and ECU software would periodically diagnose this connection. Finally, if a failure should occur, the ECU software will “set” the DTC that was allocated to this failure mode (we’ll talk about what it means to “set” a DTC shortly).

When the sensor is disconnected, the ECU knows because the voltage falls outside the normal range – in our example, reading 5V for a long time means the sensor is no longer connected, and ECU software can report that this diagnostic has failed.

Though the above example is simple, there’s quite a bit to unpack.

DTC IDs

System designers don’t arbitrarily choose 3 bytes to represent a diagnostic failure. Instead, each byte (and in fact, each bit) has a specific meaning that helps identify the type of system that has failed. Broadly, there are 4 categories of DTCs in a vehicle:

  1. P-Codes (powertrain) – These are DTCs that relate to the torque path of the vehicle. From pedal input all the way down to final drivetrain, any failure points along the way should have an associated DTC that is a P-Code.
  2. C-Codes (chassis) – These are DTCs that are set by components that aren’t in the torque path but aren’t part of the passenger cabin either. They include systems like steering, suspension, and brakes.
  3. B-Codes (body) – These DTCs are related to the systems inside and surrounding the passenger cabin, like window motors, cabin heaters, and windshield wipers.
  4. U-Codes (communication) – Finally, U-Codes are the set of DTCs that deal with communication between different ECUs, and any shared components between them. Failure modes like loss of communication, faulty communication, and implausible signals all come under this category.

In the 3 bytes of a DTC, the upper 2 bits represent the DTC type (P, C, B, or U). The rest of the bits also have a defined meaning:

The breakdown of a DTC ID – As engineers we’re concerned mainly with the hex value and what each bit means. However, the display value (what is externally reported to technicians and test tools) isn’t the hex code. In the case of the above, 0xD81721 would be displayed as U1817-21.
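
As a rough sketch of that mapping, the function below turns a 3-byte DTC into its display string. The name `dtc_to_display` is an assumption made for illustration, and the letter order (00 = P, 01 = C, 10 = B, 11 = U) follows the usual SAE J2012 convention.

```c
#include <stdint.h>
#include <stdio.h>

/* Formats a 3-byte DTC as its display string, e.g. 0xD81721 -> "U1817-21".
 * out must have room for 9 bytes (8 characters plus the terminator).
 * dtc_to_display is a hypothetical helper, not part of any standard API. */
void dtc_to_display(uint32_t dtc, char out[9])
{
   static const char category[4] = { 'P', 'C', 'B', 'U' };
   uint8_t high = (uint8_t)((dtc >> 16) & 0xFFu);
   uint8_t mid  = (uint8_t)((dtc >> 8) & 0xFFu);
   uint8_t low  = (uint8_t)(dtc & 0xFFu);

   snprintf(out, 9, "%c%X%X%02X-%02X",
            category[high >> 6],             /* upper 2 bits: P, C, B or U */
            (unsigned)((high >> 4) & 0x3u),  /* next 2 bits: first digit */
            (unsigned)(high & 0xFu),         /* low nibble: second digit */
            (unsigned)mid,                   /* middle byte: last two digits */
            (unsigned)low);                  /* low byte: shown after the dash */
}
```

Called with 0xD81721, this fills the buffer with U1817-21, matching the example above.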

Tests, Monitors, and Operation Cycles

Before we get into the innards of a DTC, we need to cover some definitions relating to on board diagnostic systems. In an automotive setting, these terms have a very specific meaning, and need to be understood to grasp the concept of diagnostics.

Test

At the core of diagnosing something is performing a test on it to see if it’s working or not. Some tests are exceedingly simple, others get very complicated. Looking again at our sample ECU and sensor setup, the code that performs the test could be as simple as:

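#include <stdbool.h>
#include <stdint.h>

/* readings above this many millivolts count as out of range */
#define SENSOR_HIGH_THRESHOLD       ( 4950 )
/* number of consecutive samples that must fail before the test fails */
#define SAMPLE_WINDOW_SIZE          ( 5 )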

/**
 * @brief Performs the out of range diagnostic on the sensor
 * @param [in] sample_mv - fixed point voltage sample in millivolts
 * @return - True if out of range, False otherwise
 */
bool test_sensor_disconnect(uint32_t sample_mv)
{
   /* static variables */
   static int32_t sample_window_position = 0;
   static bool sample_window [ SAMPLE_WINDOW_SIZE ] = { 0 };
   
   /* local variables */
   bool current_sample = false;
   bool out_of_range = true;

   /* check if the current sample is out of range and has failed the diagnostic */
   if (sample_mv > SENSOR_HIGH_THRESHOLD)
   {
      /* this is out of the normal range, and so we say this sample has a failure */
      current_sample = true;
   }

   /* put the sample into the sample window and update position */
   sample_window [ sample_window_position ] = current_sample;
   sample_window_position += 1;
   sample_window_position %= SAMPLE_WINDOW_SIZE;

   /* loop through the sample window, if all samples have failed, 
    * report back that the diagnostic has failed, otherwise assume 
    * it has passed */
   for (uint32_t index = 0; index < SAMPLE_WINDOW_SIZE; index += 1)
   {
      if (sample_window [ index ] == false)
      {
          out_of_range = false;
      }
   }

   return out_of_range;
}

A test, then, is a set of actions that determines whether some part of your system is operating as expected or has failed. Its output is a boolean value: pass or fail. All diagnostics in a system, no matter how complicated or simple, always end in producing this single boolean value. Each of these pass or fail values is referred to as a test sample. Notice that the test above builds some debouncing into its samples (it only reports a failure once every sample in the window has failed). This doesn’t have to be the case, however.

Monitor

Now that we know what a test is, and what a simple one looks like, we’re ready to proceed to the definition of a diagnostic monitor. Simply put, a monitor is a set of one or more tests which, together, can diagnose whether a portion of a system is working as intended or is faulted. There is a 1:1 mapping of DTCs to monitors. Each monitor reports the state of one part of a system to a single DTC.

Many monitors consist of just one test. For the example above, the monitor could simply be:

/**
 * @brief - The monitor function for this sensor, calls the test, and reports the 
 *          DTC
 */
void perform_sensor_diagnostic(void)
{
   uint32_t sensor_value_mv = iohwabs_get_sensor_voltage();
   bool sensor_diagnostic_failed = false;

   /* perform all the tests that involve this sensor */
   if (true == test_sensor_disconnect(sensor_value_mv))
   {
      sensor_diagnostic_failed = true;
   }

   /* If the sensor diagnostic failed, report it to the central DTC manager. In this
    * project, the DTC manager takes the hex code of the DTC you wish to report as 
    * the first argument, and the failure status as the second argument */
   if (true == sensor_diagnostic_failed)
   {
      report_dtc_status(0xd81721, true);
   }
   else
   {
      report_dtc_status(0xd81721, false);
   }
}

As you can see, a monitor runs the test to diagnose the system (in this case, a sensor). A monitor could have more than one test, though. Imagine our same sensor system, but the DTC associated with that sensor is now more general and can cover multiple points of failure:

/**
 * @brief - The monitor function for this sensor, performs available diagnostics.
 *          If the diagnostic failed, report it
 */
void perform_sensor_diagnostic(void)
{
   uint32_t sensor_value_mv = iohwabs_get_sensor_voltage();
   bool sensor_diagnostic_failed = false;

   /* perform all the tests that involve this sensor (the stuck-low and
    * stuck-in-range tests are assumed to be defined elsewhere) */
   if (true == test_sensor_disconnect(sensor_value_mv))
   {
      sensor_diagnostic_failed = true;
   }
   if (true == test_sensor_stuck_low(sensor_value_mv))
   {
      sensor_diagnostic_failed = true;
   }
   if (true == test_sensor_stuck_in_range(sensor_value_mv))
   {
      sensor_diagnostic_failed = true;
   }

   /* If the sensor diagnostic failed, report it to the central DTC manager. In this
    * project, the DTC manager takes the hex code of the DTC you wish to report as 
    * the first argument, and the failure status as the second argument */
   if (true == sensor_diagnostic_failed)
   {
      report_dtc_status(0xd81721, true);
   }
   else
   {
      report_dtc_status(0xd81721, false);
   }
}

Operation Cycle

The concept of an operation cycle is a bit tricky, and the phrase is somewhat loaded. If you talk to powertrain engineers who work on gas or diesel engines, an operation cycle is a well-defined set of states that the engine goes through. First the engine turns on, then it idles, then it achieves some minimum speed for some minimum time, and then eventually it turns off.

Why not just have operation cycles be key cycles? There are a lot of regulatory requirements behind what an operation cycle means for a vehicle that produces emissions. Basically the spirit of those rules is that an operation cycle has to be long enough and complex enough so that all monitors associated with it are run. Therefore, our definition of an operation cycle is the start and end of the conditions of a vehicle for all the monitors associated with it to run.

A vehicle can have many different subsystems that have their own operation cycles. An example is a hybrid car with an engine and an on-board charger. It doesn’t make sense to associate the DTCs of the on-board charger with the operation cycle of the engine, because the operation of one doesn’t mean the monitors of the other will get exercised. Running the car engine may not necessarily run any of the tests for the on-board charger. Similarly, charging the car with the engine off wouldn’t run any of the tests associated with the engine.

Operation cycles are important because every monitor associated with an operation cycle must run at least once during it. Knowing only that a diagnostic failed doesn’t tell you whether it’s a current failure or a past one. Knowing that a diagnostic failed in the last operation cycle immediately narrows that down.

DTC Status Byte

Now we get into what it means to “set” a DTC. Let’s bring back our simple system from the introduction:

Simple ECU with an external sensor

When the ECU software detects that the sensor has been disconnected, it wants to report that failure. Remember, the end goal of all these DTCs is to help diagnose exactly what went wrong in the system. With that in mind, when a service technician or engineer reads the state of all DTCs on a vehicle, it’s not enough to simply report which ones had a failure. As a technician or engineer trying to diagnose the root cause of the failure, you need more information, such as when did it fail, what was the last time it failed, is it failing every time or is the problem intermittent, etc. Put simply, a simple boolean pass or fail is insufficient for this.

Cue the DTC status byte: though one boolean is insufficient to represent the state of a DTC, it turns out 8 are enough. Together these 8 bits form one status byte, which is descriptive enough to represent a DTC. The 8 booleans are:

| Position | Name | Abbreviation | Description |
| --- | --- | --- | --- |
| 0 | testFailed | TF | 0 – Last test passed; 1 – Last test failed |
| 1 | testFailedThisOperationCycle | TFTOC | 0 – No test has failed since this operation cycle began; 1 – At least one test has failed since this operation cycle began |
| 2 | pendingDTC | PDTC | 0 – No test has failed this operation cycle or the last; 1 – At least one test failed this operation cycle or the last |
| 3 | confirmedDTC | CDTC | 0 – The test didn’t fail enough to warrant setting a DTC; 1 – The test has failed enough times that the DTC is now set |
| 4 | testNotCompleteSinceLastClear | TNCSLC | 0 – At least one diagnostic test has completed since the last time DTCs were cleared; 1 – A diagnostic test has not yet run to completion since the last time DTCs were cleared |
| 5 | testFailedSinceLastClear | TFSLC | 0 – A test has not failed since the last time DTCs were cleared; 1 – At least one test has failed since the last time DTCs were cleared |
| 6 | testNotCompleteThisOperationCycle | TNCTOC | 0 – At least one test has completed this operation cycle; 1 – No test for this DTC has been run since the beginning of this cycle |
| 7 | warningIndicatorRequested | WIR | 0 – No malfunction light is being requested from this DTC; 1 – The malfunction light should be illuminated due to this fault |

DTC Status Byte Definition

Aside – you may have noticed the bits testNotCompleteSinceLastClear and testFailedSinceLastClear. Clearing all the DTCs present on a car is done through a UDS service (0x14). Doing so should reset all the stored DTC status bytes to 0x50 (indicating that the test has not completed since the last clear, nor in this operation cycle). For OBD systems, this gets a bit more complicated, but if you work on EVs it’s less of a concern.

Each DTC in the system has one DTC status byte associated with it. The DTC, along with its status byte, provides enough information for a technician or engineer to know what went wrong in a system. Therefore, when we say “set” a DTC, we mean that we should report the fault associated with the system that failed, and that the individual status bits of that DTC should be updated accordingly.

To make this concrete, consider again our simple system with an ECU connected to an external sensor. Let’s assume that there is a DTC to indicate that the sensor has been disconnected from the ECU. At the very beginning, right after you plug in your ECU for the first time (or clear DTCs), the state of the DTC could look like:

DTC Status byte is 0x50, indicating the tests associated with this DTC have not even run yet.

Say there is nothing wrong with the sensor – then the tests will begin to pass:

DTC status byte is 0x00, indicating that the tests were run at least once, and they all passed.

Finally, let’s introduce a fault in the system and see how the DTC status byte changes:

In this example, testFailed, testFailedThisOperationCycle, pendingDTC, and testFailedSinceLastClear all get set, giving a status byte of 0x27.

There’s a certain nuance to the PDTC bit and the CDTC bit. Basically, some DTCs take multiple operation cycles of failures before they become “confirmed” (indicating a real problem with the system). In those cases, the pending bit shows that the tests are failing, but haven’t yet failed across enough operation cycles to warrant a warning. The above example system did not set the CDTC bit, and hence we can assume the DTC associated with that sensor is a multi-trip fault. If on the next operation cycle the test fails again, the CDTC bit would get set. We call this kind of DTC a two-trip fault (it takes two trips with a failing test to confirm).
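
The status transitions walked through above can be sketched in a few functions. This is a simplified illustration, not the full ISO 14229 state machine: the names (`dtc_record_t`, `dtc_clear`, `dtc_report`, `dtc_start_operation_cycle`) and the two-trip threshold are assumptions, and the warning indicator bit and any aging behaviour are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit masks for the DTC status byte, one per row of the table above */
#define DTC_STATUS_TF       ( 0x01u )
#define DTC_STATUS_TFTOC    ( 0x02u )
#define DTC_STATUS_PDTC     ( 0x04u )
#define DTC_STATUS_CDTC     ( 0x08u )
#define DTC_STATUS_TNCSLC   ( 0x10u )
#define DTC_STATUS_TFSLC    ( 0x20u )
#define DTC_STATUS_TNCTOC   ( 0x40u )

/* Assumed threshold: failing operation cycles needed to confirm the DTC */
#define DTC_TRIPS_TO_CONFIRM ( 2u )

/* Hypothetical per-DTC record kept by a DTC manager */
typedef struct
{
   uint8_t status;        /* the DTC status byte */
   uint8_t failed_cycles; /* consecutive operation cycles with a failure */
} dtc_record_t;

/* Clearing resets the status byte to 0x50: nothing has run yet */
void dtc_clear(dtc_record_t *dtc)
{
   dtc->status = (uint8_t)(DTC_STATUS_TNCSLC | DTC_STATUS_TNCTOC);
   dtc->failed_cycles = 0u;
}

/* Called by a monitor each time it produces a pass or fail result */
void dtc_report(dtc_record_t *dtc, bool failed)
{
   /* a test has now completed, both this cycle and since the last clear */
   dtc->status &= (uint8_t)~(DTC_STATUS_TNCSLC | DTC_STATUS_TNCTOC);

   if (failed)
   {
      /* the first failure in a cycle counts as one trip */
      if (((dtc->status & DTC_STATUS_TFTOC) == 0u) && (dtc->failed_cycles < UINT8_MAX))
      {
         dtc->failed_cycles += 1u;
      }
      dtc->status |= (uint8_t)(DTC_STATUS_TF | DTC_STATUS_TFTOC |
                               DTC_STATUS_PDTC | DTC_STATUS_TFSLC);
      if (dtc->failed_cycles >= DTC_TRIPS_TO_CONFIRM)
      {
         dtc->status |= (uint8_t)DTC_STATUS_CDTC;
      }
   }
   else
   {
      dtc->status &= (uint8_t)~DTC_STATUS_TF;
   }
}

/* Called when a new operation cycle begins */
void dtc_start_operation_cycle(dtc_record_t *dtc)
{
   bool completed = ((dtc->status & DTC_STATUS_TNCTOC) == 0u);
   bool failed    = ((dtc->status & DTC_STATUS_TFTOC) != 0u);

   if (completed && !failed)
   {
      /* a full cycle ran without failure, so the DTC is no longer pending */
      dtc->status &= (uint8_t)~DTC_STATUS_PDTC;
      dtc->failed_cycles = 0u;
   }
   dtc->status &= (uint8_t)~DTC_STATUS_TFTOC;
   dtc->status |= (uint8_t)DTC_STATUS_TNCTOC;
}
```

Starting from a clear, a passing report followed by a failing report moves the status byte from 0x50 to 0x00 to 0x27. A second failing report in the next operation cycle sets the confirmed bit as well.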

As you can see, with the additional information provided by the status bits, an engineer can quickly determine the state of the system. The next time the car comes in for service, or the next time DTCs are read remotely through telematics, they can tell immediately which systems were failing and whether they are still failing. This provides a powerful, standardized means for diagnosing problems in the car.

Conclusion

We’ve touched on the basics of what DTCs are, and with it a few basic concepts relating to automotive diagnostics, like tests, monitors, and operation cycles. We’ve also seen what’s in a DTC status byte. In a follow up post, we’ll delve more deeply into these concepts and introduce the common software abstraction used to manage all of this: the Diagnostic Event Manager (DEM).

UDS Data By Identifier (DID)

Part of a series on UDS – if you missed it, check out the introduction to the protocol.

What is it?

Data by Identifier, or DID for short, is a core component of building out ECU diagnostics and being able to troubleshoot a malfunctioning ECU. A DID is essentially a piece of data that is given a 16-bit identification code – hence the name data by identifier. This 16-bit ID uniquely binds a piece of data, its length, and its interpretation together into a DID. What each DID represents is entirely up to the system designer. You may choose to represent internal variables with separate DIDs to facilitate easy read access. You may also combine several items of data into a single DID. Many designers also make use of bit-fields to show the state of a sub-system via a series of booleans.

For example, see our fictional ECU’s list of supported DIDs:

| Data Identifier | Name | Length (Bytes) | Interpretation |
| --- | --- | --- | --- |
| 0xF190 | Vehicle Identification Number (VIN) | 17 | 17 bytes of ASCII data representing the VIN |
| 0xF170 | Board Part Number | 4 | Unique 32-bit board part number |
| 0xA001 | System Voltage | 4 | Fixed point value. Scaling of 0.01V, offset of 16384 counts |
| 0xA108 | Door Ajar Status | 1 | Bit-field (0 is closed, 1 is open): Bit 0 – Front Driver Door Status; Bit 1 – Front Passenger Door Status; Bit 2 – Rear Driver Door Status; Bit 3 – Rear Passenger Door Status; Bit 4 – Trunk Status; Bit 5 – Hood Status; Bit 6 – Gas Tank Status |
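
One way such a list might be represented in ECU software is a constant lookup table plus small helpers for the bit-fields. The sketch below mirrors the fictional table above. The type and function names (`did_entry_t`, `did_lookup`, `door_is_open`) are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One entry per supported DID: the 16-bit ID and its fixed data length */
typedef struct
{
   uint16_t id;
   uint8_t  length_bytes;
} did_entry_t;

/* The fictional ECU's supported DIDs, taken from the table above */
static const did_entry_t did_table[] =
{
   { 0xF190u, 17u }, /* Vehicle Identification Number (ASCII) */
   { 0xF170u, 4u  }, /* Board part number */
   { 0xA001u, 4u  }, /* System voltage */
   { 0xA108u, 1u  }, /* Door ajar status bit-field */
};

/* Returns the table entry for a DID, or NULL if the DID is not supported */
const did_entry_t *did_lookup(uint16_t id)
{
   for (size_t index = 0; index < (sizeof(did_table) / sizeof(did_table[0])); index += 1)
   {
      if (did_table[index].id == id)
      {
         return &did_table[index];
      }
   }
   return NULL;
}

/* Bit positions inside the single data byte of DID 0xA108 */
enum
{
   DOOR_FRONT_DRIVER    = 0,
   DOOR_FRONT_PASSENGER = 1,
   DOOR_REAR_DRIVER     = 2,
   DOOR_REAR_PASSENGER  = 3,
   DOOR_TRUNK           = 4,
   DOOR_HOOD            = 5,
   DOOR_GAS_TANK        = 6
};

/* Returns true if the given bit of the door ajar status reads "open" */
bool door_is_open(uint8_t door_ajar_status, unsigned bit)
{
   return ((door_ajar_status >> bit) & 0x01u) != 0u;
}
```

Keeping the length next to the ID lets the same table drive both request validation and response decoding.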

How’s it work?

There are several UDS services that use DIDs. The two most popular are Read Data By Identifier, which is service ID 0x22, and Write Data by Identifier, which has service ID 0x2E. Like the names suggest, service 0x22 is used to read data from the ECUs, and service 0x2E is used to write data into the ECUs. Depending on the DID, it can be either readable, writable, or both.

Read Data by Identifier (0x22)

For service 0x22, there is no sub-service – rather, the bytes following the service ID are interpreted as data identifiers.

As can be seen, the client requests a Read Data by ID of ID 0xA001 (system voltage from the table above) from the server. The server replies with a positive reply (0x62), the ID of the data identifier (0xA001) and finally the data itself. Clearly this is an essential service used for reading predefined values out of the ECU.
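
On the tester side, that exchange might look roughly like the sketch below: one helper builds the 3-byte request, and another checks that a reply is a positive response for the expected DID and returns the data that follows. The function names are assumptions for illustration, and transport-layer framing (such as ISO-TP) is left out.

```c
#include <stddef.h>
#include <stdint.h>

#define UDS_SID_READ_DID          ( 0x22u )
#define UDS_POSITIVE_RESPONSE_ADD ( 0x40u ) /* positive response SID = request SID + 0x40 */

/* Writes a Read Data by Identifier request for one DID.
 * Returns the number of bytes written, or 0 if the buffer is too small. */
size_t uds_build_read_did(uint16_t did, uint8_t *request, size_t request_size)
{
   if (request_size < 3u)
   {
      return 0u;
   }
   request[0] = UDS_SID_READ_DID;
   request[1] = (uint8_t)(did >> 8);     /* DID is sent most significant byte first */
   request[2] = (uint8_t)(did & 0xFFu);
   return 3u;
}

/* Checks that the response is a positive reply (0x62) for the given DID.
 * Returns a pointer to the data bytes and stores their count in payload_len,
 * or returns NULL if the response is negative, too short, or for another DID. */
const uint8_t *uds_read_did_payload(const uint8_t *response, size_t response_len,
                                    uint16_t did, size_t *payload_len)
{
   if (response_len <= 3u)
   {
      return NULL;
   }
   if (response[0] != (UDS_SID_READ_DID + UDS_POSITIVE_RESPONSE_ADD))
   {
      return NULL;
   }
   if ((response[1] != (uint8_t)(did >> 8)) || (response[2] != (uint8_t)(did & 0xFFu)))
   {
      return NULL;
   }
   *payload_len = response_len - 3u;
   return &response[3];
}
```

For DID 0xA001 the request bytes are 22 A0 01, and a positive reply starts with 62 A0 01 followed by the 4 data bytes.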

An interesting feature of this service, which engineers often omit while designing and testing their UDS servers, is that multiple DIDs can be requested at a time by the tester. This is done simply by adding more IDs in the request. The response can then be decoded based on the lengths of the individual DIDs.

Notice how the tester issued a read request for DID 0xA001 and 0xA108? And the server replied back with the data for 0xA001, and then in the same reply, the data for 0xA108.
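
Because the response carries no length fields, the tester has to know each DID's length in order to split it. The sketch below walks a multi-DID response using the lengths from the fictional DID table. The names `did_record_t`, `example_did_length`, and `uds_split_read_response` are assumptions for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* One decoded DID: its ID plus a pointer into the response buffer */
typedef struct
{
   uint16_t       did;
   const uint8_t *data;
   size_t         length;
} did_record_t;

/* Data lengths from the fictional DID table; 0 means the DID is unknown */
static size_t example_did_length(uint16_t did)
{
   switch (did)
   {
      case 0xF190u: return 17u;
      case 0xF170u: return 4u;
      case 0xA001u: return 4u;
      case 0xA108u: return 1u;
      default:      return 0u;
   }
}

/* Splits a positive 0x22 response into per-DID records.
 * Returns the number of records found, or -1 if the response is malformed,
 * contains an unknown DID, or holds more DIDs than max_records. */
int uds_split_read_response(const uint8_t *response, size_t response_len,
                            did_record_t *records, size_t max_records)
{
   size_t offset = 1u; /* skip the 0x62 response ID */
   int count = 0;

   if ((response_len < 1u) || (response[0] != 0x62u))
   {
      return -1;
   }

   while (offset < response_len)
   {
      uint16_t did;
      size_t length;

      if ((response_len - offset) < 2u)
      {
         return -1; /* not enough bytes left for a DID */
      }
      did = (uint16_t)(((uint16_t)response[offset] << 8) | response[offset + 1u]);
      length = example_did_length(did);
      offset += 2u;

      if ((length == 0u) || ((response_len - offset) < length) || ((size_t)count >= max_records))
      {
         return -1;
      }
      records[count].did = did;
      records[count].data = &response[offset];
      records[count].length = length;
      offset += length;
      count += 1;
   }
   return count;
}
```

An unknown DID makes the rest of the response undecodable, which is why the sketch gives up rather than guessing a length.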

Write Data by Identifier (0x2E)

Analogous to the Read DID Service, Write Data by ID allows the client to write data into predefined values. For example, programming the Vehicle Identification Number (VIN) into an ECU is often done through this service.

As can be seen above, the client requests a Write Data by ID service (0x2E), followed by the ID of the variable to be written (0xF190), and finally the correct length of data required for this variable (17 bytes). The server then validates the data being written (length checks, value checks, etc.) and if all went well, replies with a positive response. Depending on the sensitivity of the data being written or read, individual DIDs may be gated behind UDS security.
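
A server-side handler for this write could be sketched as follows. It assumes the request has already been routed here by service ID, handles only the VIN DID, and uses a deliberately loose value check (alphanumeric ASCII). The function name and the storage variable are hypothetical. The negative response codes used are 0x13 (incorrect message length or invalid format) and 0x31 (request out of range).

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define UDS_SID_WRITE_DID            ( 0x2Eu )
#define UDS_NEGATIVE_RESPONSE        ( 0x7Fu )
#define UDS_NRC_INCORRECT_LENGTH     ( 0x13u )
#define UDS_NRC_REQUEST_OUT_OF_RANGE ( 0x31u )
#define DID_VIN                      ( 0xF190u )
#define VIN_LENGTH                   ( 17u )

/* Hypothetical storage; a real ECU would write to non-volatile memory */
static uint8_t stored_vin[VIN_LENGTH];

/* Fills in a 3-byte negative response and returns its length */
static size_t negative_response(uint8_t *response, uint8_t nrc)
{
   response[0] = UDS_NEGATIVE_RESPONSE;
   response[1] = UDS_SID_WRITE_DID;
   response[2] = nrc;
   return 3u;
}

/* Handles a Write Data by Identifier request for the VIN.
 * request:  SID, DID high byte, DID low byte, then the data bytes
 * response: buffer of at least 3 bytes
 * Returns the number of response bytes written. */
size_t uds_handle_write_vin(const uint8_t *request, size_t request_len, uint8_t *response)
{
   uint16_t did;

   if (request_len < 4u)
   {
      return negative_response(response, UDS_NRC_INCORRECT_LENGTH);
   }
   did = (uint16_t)(((uint16_t)request[1] << 8) | request[2]);
   if (did != DID_VIN)
   {
      return negative_response(response, UDS_NRC_REQUEST_OUT_OF_RANGE);
   }
   /* length check: the VIN must be exactly 17 bytes */
   if (request_len != (3u + VIN_LENGTH))
   {
      return negative_response(response, UDS_NRC_INCORRECT_LENGTH);
   }
   /* value check (simplified): only letters and digits are accepted */
   for (size_t index = 0; index < VIN_LENGTH; index += 1)
   {
      if (isalnum(request[3u + index]) == 0)
      {
         return negative_response(response, UDS_NRC_REQUEST_OUT_OF_RANGE);
      }
   }

   memcpy(stored_vin, &request[3], VIN_LENGTH);

   /* positive response: 0x2E + 0x40, followed by the DID */
   response[0] = (uint8_t)(UDS_SID_WRITE_DID + 0x40u);
   response[1] = request[1];
   response[2] = request[2];
   return 3u;
}
```

A security check, where the DID is gated behind UDS security, would sit before the length and value checks.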

Conclusion

Data Identifiers provide an easy way to choose what data can be read out of an ECU, or written into it, by mapping 16-bit IDs to predefined information and lengths. Typical automotive ECUs can have hundreds of DIDs that expose information as diverse as internal variables and board part numbers. Among the most popular services related to DIDs are Read Data by Identifier (0x22) and Write Data by Identifier (0x2E).

UDS Security Access

Part of a series on UDS – if you missed it, check out the introduction to the protocol.

What is it?

Many UDS services, when executed, could create an unsafe state in the vehicle or reveal sensitive information about the ECU or vehicle. We automotive engineers thus want to ensure that only qualified, authorized personnel run some of these services. For example, perhaps we don’t want just anyone to flash software on our ECU, or write in a new serial number. It is these concerns that motivate the use of UDS service 0x27 – security access.

UDS service 0x27 is a means to prove the identity of the UDS tester so that the ECU knows that UDS requests coming from it are from an authorized client. Using this service, the ECU can require certain services, sub-services, or even certain parameters of a service to be used only in conjunction with security access. This makes sure that those select UDS functions can only be executed by an authorized person.

How’s it work?

All security comes down to proving your identity. You can do this by something you have, something you know, or something you are. Something you have would be like a physical token (like your house key). Something you know could be a password or an algorithm. And something you are would be like your fingerprints or iris scan. It should be pretty obvious that UDS security relies on something you know in order to prove your identity.

The “something you know” in this case is almost exclusively a secret algorithm, or better still, a well known algorithm with a secret key. It starts with the tester issuing a request for the server (the ECU) to generate a seed. The seed should be some agreed upon number of random bytes that is unpredictable (a nonce). The tester, using its knowledge of the secret algorithm or key will then take the seed and generate an unlock key. It then sends that back to the ECU, which, if correct, will grant the tester access to the ECU’s secure services.

UDS security also allows you to define levels of security access, denoted by the sub-function with which you use service 0x27. These levels are arbitrary, and can be used to silo different services behind different levels. An example would be for a given level you have one secret key that you share with a supplier. Then any UDS services or sub-services you gate behind that level can be used by that supplier. But other services and sub-services that use a different level won’t be available to that supplier. Let’s take a look at an example unlock sequence:

As always, let’s analyze each step that occurred in the above sequence:

  • The tester sends a request to unlock the ECU using UDS service 0x27, subservice 0x01. This indicates that the security level for which the tester is applying to unlock the ECU is level 1.
  • The UDS server receives this request, and assuming conditions are correct, generates the random seed. It returns this seed as part of the positive response to service 0x27 0x01 (for a reminder on how the UDS protocol works, see UDS Protocol). In this case the random seed was 0D 42 8C 91.
  • The UDS client, with its secret knowledge of the algorithm used to unlock the ECU, calculates an unlock key. In this case, the secret algorithm was simply flipping all the bits of the seed to get the key. Notice that the tester sends the unlock key using service 0x27, subservice 0x02. The initial request’s subservice is always an odd number, and the subsequent unlock key is sent using that number + 1, with the key itself following as a parameter.
  • If the unlock key the UDS client has provided matches what the UDS server expects, it sends a positive response to the unlock service. Otherwise, it would send a negative response code here indicating the reason for failure.
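
The toy algorithm from this sequence (flip every bit of the seed) can be written out as below. It is shown only to make the example concrete and is not suitable for a real ECU. The function names are assumptions for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Tester side: derives the unlock key by inverting every bit of the seed.
 * This is the toy algorithm from the example sequence, not a secure one. */
void demo_key_from_seed(const uint8_t *seed, uint8_t *key, size_t length)
{
   for (size_t index = 0; index < length; index += 1)
   {
      key[index] = (uint8_t)~seed[index];
   }
}

/* Server side: checks the received key against the seed it handed out.
 * Differences are accumulated rather than returning at the first mismatch,
 * so the comparison takes the same time wherever the first wrong byte is. */
bool demo_key_matches(const uint8_t *seed, const uint8_t *received_key, size_t length)
{
   uint8_t difference = 0u;

   for (size_t index = 0; index < length; index += 1)
   {
      difference |= (uint8_t)(received_key[index] ^ (uint8_t)~seed[index]);
   }
   return difference == 0u;
}
```

For the seed 0D 42 8C 91 from the example, the resulting key is F2 BD 73 6E.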

If all went well, the ECU can assume that the UDS Client is now authorized up to security level 1. Any UDS features that are blocked by this security access can now be used, whereas they couldn’t be used before.

UDS security access is linked to UDS sessions. When a UDS session expires, so too does the security access that was obtained while in that session. This means if you unlock an ECU, then wait too long without sending any UDS commands, you’ll likely time out and lose your security access.

The sub-function of UDS Security Service 0x27 is the “security level” that is being unlocked. As mentioned above, the concept of a security level is somewhat arbitrary. Level 9 doesn’t necessarily mean you are granted any more or any less privileges than level 3. It all depends on the system designer, and what services and sub-services they feel should be protected, and who gets to run those services.

For example, you may choose to designate security level 1 for all things related to writing manufacturing information into the ECU (e.g. serial number, date of manufacturing, VIN, etc.). You may then designate security level 3 for all things related to writing ECU runtime data (e.g. tire radius for a vehicle controller, or battery capacity for a battery management ECU). The choice is yours – the spec allows you to have 33 levels.

Last thing – a security level forms the relationship between the “request seed” sub-function and the “send key” sub-function. We generally refer to the level as the sub-function provided in the “request seed” portion of the sequence. For instance, if we want to unlock security level 9, our seed request looks like 0x27 0x09. The subsequent key send must then be 0x27 0x0A. That’s also why security levels are always odd numbers.
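
The pairing between the two sub-functions is simple enough to capture in a couple of helpers. This is a minimal sketch; the function names are assumptions for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* "Request seed" sub-functions are the odd values */
bool security_is_request_seed(uint8_t sub_function)
{
   return (sub_function & 0x01u) != 0u;
}

/* The "send key" sub-function paired with a security level is level + 1 */
uint8_t security_send_key_sub_function(uint8_t level)
{
   return (uint8_t)(level + 1u);
}

/* Server-side check: a "send key" request is only valid if it matches the
 * level for which a seed was most recently requested */
bool security_send_key_is_expected(uint8_t pending_level, uint8_t sub_function)
{
   return security_is_request_seed(pending_level) &&
          (sub_function == security_send_key_sub_function(pending_level));
}
```

For security level 9, the seed request is 27 09 and the matching key is sent with 27 0A.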

Other Considerations

The purpose of this UDS service is to authenticate a tester, as unlocking the ECU’s security level gives the tester access to other UDS services that can be dangerous or reveal secret information. Therefore the implementation of this service should be done with security in mind. A few points to consider:

  • One should use the NIST Guidelines for Authentication when implementing this service.
    • The algorithm should be a well-known cryptographic based authentication mechanism, such as a CMAC or HMAC based on a secret key.
    • The implementer should stay away from “security through obscurity” even though it appears tempting.
  • Timing is important. The UDS server should implement strict timing for how long the tester has to calculate the unlock key. Exceeding this time should be followed by a negative response code.
  • Failed security unlock attempts should force a time-delay between subsequent attempts. This makes brute-force attacks for guessing the unlock key more difficult. Time delays should be stored in some form of persistent memory, so that power cycling the ECU won’t reset the time.
  • When generating the seed, make it as random as possible. Many modern microcontrollers come with cryptographic engines as peripherals – these should be used as often as possible when generating random numbers. If this isn’t available, there are software algorithms that can generate pseudo-random numbers, but they need a reliable source of entropy. Usually a free-running timer in conjunction with a few ADCs is sufficient. If an attacker captures your seed + key unlock sequence, and at a later time your ECU produces the same seed when requested, the attacker can simply present the old key and unlock your ECU.

Conclusion

UDS security access is a service that lets ECUs know that they are talking to an authenticated UDS client. Without it, anyone could access all UDS services, potentially leading to danger or leaked information. The system designer chooses which UDS services can be used with and without security. There are multiple levels of security available as sub-functions of service 0x27, which can be used to group services and sub-services together into security domains.

Diagnostic Sessions and Tester Present

Part of a series on UDS – if you missed it, check out the introduction to the protocol.

What’s a Session and Why do I Care?

UDS Service 0x10, Diagnostic Session Control, is one of the most widely used diagnostic services and core to our understanding of the UDS protocol. Session control is a way to enter different modes of operation in the diagnostic server (located on the ECU) and to group services together into logical containers, known as sessions. Different services may or may not be allowed to operate in different sessions, and different sessions may not be allowed to be entered in different modes of operation. Hence, gating certain services behind a session ensures that they can only be executed when the operating mode of the ECU is conducive to the execution of that service.

A simple example is defining a session called Programming Session. Our intention behind defining this session is to make a container of services related to (you guessed it) programming the ECU. In order to enter programming session, we can define a set of conditions to make sure it is safe to temporarily take the ECU offline while we reprogram it. These can include, among others, the vehicle at standstill and transmission in park. By grouping several UDS services together and enabling them based on session, we can avoid redundant checks for each service. That is to say, we can check all the preconditions related to programming an ECU once as part of switching sessions. Then, all the services gated by that session can all be assumed to be operating within the same safe state.

Diagram showing simplified session change flow

Most UDS servers will gate services 0x34 (requestDownload), 0x35 (requestUpload), 0x36 (transferData) and 0x37 (requestTransferExit) behind the programming session. Looking at the above diagram, that means trying to execute a requestDownload before switching to programming session will result in a negative response.
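
A session gate of this kind is often just a table lookup performed before a service is dispatched. The sketch below is one possible layout, with hypothetical names and an example service list. It returns negative response code 0x7F (service not supported in active session) when a known service is requested in the wrong session, and 0x11 (service not supported) for a service that is not in the table at all.

```c
#include <stddef.h>
#include <stdint.h>

#define UDS_SESSION_DEFAULT     ( 0x01u )
#define UDS_SESSION_PROGRAMMING ( 0x02u )
#define UDS_SESSION_EXTENDED    ( 0x03u )
#define UDS_SESSION_SAFETY      ( 0x04u )

#define UDS_NRC_SERVICE_NOT_SUPPORTED                   ( 0x11u )
#define UDS_NRC_SERVICE_NOT_SUPPORTED_IN_ACTIVE_SESSION ( 0x7Fu )

/* One bit per session so that a service can be allowed in several sessions */
#define SESSION_BIT(session) ( (uint8_t)(1u << (session)) )
#define ALL_SESSIONS         ( (uint8_t)0x1Eu ) /* bits 1 to 4 */

typedef struct
{
   uint8_t sid;              /* UDS service ID */
   uint8_t allowed_sessions; /* bit mask built from SESSION_BIT() */
} service_gate_t;

/* Example gating table; the choice of services is illustrative only */
static const service_gate_t service_gates[] =
{
   { 0x10u, ALL_SESSIONS },                         /* diagnosticSessionControl */
   { 0x22u, ALL_SESSIONS },                         /* readDataByIdentifier */
   { 0x3Eu, ALL_SESSIONS },                         /* testerPresent */
   { 0x34u, SESSION_BIT(UDS_SESSION_PROGRAMMING) }, /* requestDownload */
   { 0x35u, SESSION_BIT(UDS_SESSION_PROGRAMMING) }, /* requestUpload */
   { 0x36u, SESSION_BIT(UDS_SESSION_PROGRAMMING) }, /* transferData */
   { 0x37u, SESSION_BIT(UDS_SESSION_PROGRAMMING) }, /* requestTransferExit */
};

/* Returns 0 if the service may run in the given session, otherwise the
 * negative response code the server should send back */
uint8_t uds_check_service_session(uint8_t sid, uint8_t session)
{
   for (size_t index = 0; index < (sizeof(service_gates) / sizeof(service_gates[0])); index += 1)
   {
      if (service_gates[index].sid == sid)
      {
         if ((session >= UDS_SESSION_DEFAULT) && (session <= UDS_SESSION_SAFETY) &&
             ((service_gates[index].allowed_sessions & SESSION_BIT(session)) != 0u))
         {
            return 0u;
         }
         return UDS_NRC_SERVICE_NOT_SUPPORTED_IN_ACTIVE_SESSION;
      }
   }
   return UDS_NRC_SERVICE_NOT_SUPPORTED;
}
```

With this table, a requestDownload (0x34) in the default session is rejected with 7F 34 7F, while the same request in the programming session passes the gate.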

This implicitly requires every UDS service first be checked to see if it is supported in the current session as part of the UDS server implementation. This also implies that there must always be at least one diagnostic session running at all times. This second requirement manifests itself in the default diagnostic session, which is equivalent to a session idle state. See below for a typical example of the different diagnostic sessions and transitions on an ECU:

Example Session State Diagram

Notice how the ECU is always in the default session state unless otherwise requested. The other transitions shown above are only examples, and are up to the system designer to specify. Note too that the above states are prescribed in ISO14229 (see table below), but the standard allows for custom user sessions to be defined as well.

| Parameter | Name | Description |
| --- | --- | --- |
| 0x01 | Default Session | The idle state – this session is always active unless explicitly made to switch. |
| 0x02 | Programming Session | Session related to (re)programming the ECU or otherwise transferring data (data uploads are also usually gated by this session). |
| 0x03 | Extended Diagnostic Session | Session related to UDS services that should only be run under prescribed conditions, because they alter the behaviour of the ECU itself. |
| 0x04 | Safety System Diagnostic Session | Session related to UDS functions that, if operated incorrectly or outside of well defined conditions, may cause harm to the ECU, vehicle, or occupant. |

Session types defined in ISO14229

Finally, each session has a set of performance requirements, which are reported by the ECU back to the tester alongside a positive response when switching sessions. Specifically, these performance requirements are the P2server_max time and the P2*server_max time. P2server_max is the time requirement that the server (the ECU) has to reply to a UDS request from the client. P2*server_max is the time requirement for the ECU to send out a UDS response after it has sent out a pending response (see my post on UDS Protocol for what a pending response is). These timing requirements are reported by the ECU and are applicable for the session that the ECU is entering. Different sessions may want to have different performance requirements, which is why this option is given. For example, the default session may have P2server_max as 50ms, because we expect a speedy response for most services. But perhaps P2server_max in the programming session could be increased to 200ms, because we know that some of the UDS services related to programming may consume extra CPU load, and we’d like to give the ECU more time to form a response.

Tester Present

Why is tester present included in this discussion? To answer that, I’ll introduce one final timing parameter: S3server. This is the maximum amount of time you can be in a non-default diagnostic session without receiving a UDS message, before automatically jumping back to the default session. Put simply, each ECU is expected to maintain an internal timer as soon as it’s in any session other than the default session. Any UDS message refreshes this timer – but if S3server time elapses without receiving a message, the ECU shall jump back to the default session.

Instead of forcing the tester to constantly perform arbitrary UDS commands just so the session doesn’t time out, UDS offers a service called Tester Present (0x3E). It literally does nothing except inform the ECU that the tester is still present (hence the name). However, that UDS message is sufficient to refresh the ECU timer, and keep the session from expiring. Many automotive tools that speak UDS will automatically send a tester present message every few seconds (usually 1 second less than S3server to give some buffer) while in a non-default session.
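To make the timer behaviour concrete, here is a minimal sketch of the ECU side of S3server. The class name, the 5000ms timeout, and the use of 0x01 as the default session ID are assumptions made for illustration; a real ECU would hang this logic off its own scheduler and diagnostic stack.

```python
DEFAULT_SESSION = 0x01   # assumed ID of the default session
EXTENDED_SESSION = 0x03


class SessionTimer:
    """Toy model of the ECU-side S3server timeout (not production code)."""

    def __init__(self, s3_server_ms=5000):  # 5000 ms is an assumed value
        self.s3_server_ms = s3_server_ms
        self.session = DEFAULT_SESSION
        self.last_request_ms = 0

    def enter_session(self, session, now_ms):
        # Switching sessions (re)starts the S3 timer.
        self.session = session
        self.last_request_ms = now_ms

    def current_session(self, now_ms):
        # Fall back to the default session once S3server has elapsed
        # without any request.
        expired = now_ms - self.last_request_ms >= self.s3_server_ms
        if self.session != DEFAULT_SESSION and expired:
            self.session = DEFAULT_SESSION
        return self.session

    def on_request(self, now_ms):
        # Any UDS request, including Tester Present (3E 00), refreshes
        # the timer, provided the session has not already expired.
        if self.current_session(now_ms) != DEFAULT_SESSION:
            self.last_request_ms = now_ms


ecu = SessionTimer()
ecu.enter_session(EXTENDED_SESSION, now_ms=0)
ecu.on_request(now_ms=4000)            # Tester Present arrives in time
print(hex(ecu.current_session(8000)))  # only 4 s since last request: still extended
print(hex(ecu.current_session(9000)))  # 5 s of silence: back to default
```

Passing the current time in explicitly keeps the sketch deterministic and easy to test; a tester-side keep-alive would simply call the equivalent of on_request at an interval shorter than S3server.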

Leaving a Session

Some of you may have noticed a loophole in the way I’ve described sessions thus far. Suppose a certain UDS service is only available in extended session. What if the tester enters extended session, starts that UDS service, and then leaves extended session?

The answer is that upon leaving any session, the ECU shall check whether the new session supports all active UDS services. If any of those services are not supported in the new session, the ECU must stop them from running.

For example, say we use UDS service 0x28 to turn off communication in the extended session. Upon leaving extended session, the communication must automatically turn itself back on, without the need for the UDS client to explicitly tell the ECU to do so.
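The check on leaving a session can be sketched as a lookup against a table of which sessions each service may run in. The table contents and the function name below are hypothetical; the real mapping is decided by the system designer for each ECU.

```python
# Hypothetical mapping: service ID -> sessions in which it may run.
ALLOWED_SESSIONS = {
    0x22: {0x01, 0x02, 0x03},  # Read Data by Identifier: every session (assumed)
    0x28: {0x03},              # Communication Control: extended only (assumed)
    0x85: {0x03},              # Control DTC Setting: extended only (assumed)
}


def services_to_inhibit(active_services, new_session):
    """Return the active service IDs that the new session does not support."""
    return sorted(
        sid
        for sid in active_services
        if new_session not in ALLOWED_SESSIONS.get(sid, set())
    )


# Communication Control was started in extended session (0x03); the ECU
# then drops back to the default session (0x01).
print([hex(sid) for sid in services_to_inhibit({0x22, 0x28}, 0x01)])
```

In this sketch an unknown service ID is treated as unsupported everywhere, which errs on the side of stopping it.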

Protocol Details

Protocol Flow for Session Control

Let’s analyze each step in the diagram above. The ECU always starts in the default diagnostic session. The tester sends the UDS command 10 03. We know from the UDS Protocol that this is service ID 0x10 with parameter 0x03, which means we are asking the ECU to switch sessions (0x10) to the extended diagnostic session (0x03). The ECU receives the request and performs the requisite logic internally to see if it is safe to switch sessions. As soon as the ECU determines it’s okay to switch, it forms the UDS response 50 03 00 32 02 58. We decode 50 03 as a positive response to the session switch command. Then 00 32 is the P2server_max time in force for the new session, and 02 58 is the P2*server_max time. These are in hex: 0x0032 is 50 in decimal, giving 50ms for P2server_max. P2*server_max is encoded in ISO14229 with a resolution of 10ms, so 0x0258 (600 in decimal) corresponds to 6000ms. Note that these new performance requirements are in force as soon as the ECU sends the response containing them. From the braces on the right of the diagram, we can see that the ECU is now in Extended Diagnostic Session.
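As a rough sketch, the response 50 03 00 32 02 58 can be decoded as follows. The function name is made up for illustration, and the scaling is an assumption based on the ISO14229-1 encoding, where P2server_max is carried in 1ms steps and P2*server_max in 10ms steps (so 02 58 works out to 6000ms).

```python
def decode_session_response(data: bytes):
    """Decode a positive response to Diagnostic Session Control (0x10).

    Returns (session, p2_server_max_ms, p2_star_server_max_ms).
    """
    if len(data) != 6 or data[0] != 0x50:
        raise ValueError("not a positive response to service 0x10")
    session = data[1]
    # Both timing fields are 16-bit, big-endian.
    p2_ms = int.from_bytes(data[2:4], "big")             # 1 ms per count
    p2_star_ms = int.from_bytes(data[4:6], "big") * 10   # 10 ms per count (assumed)
    return session, p2_ms, p2_star_ms


session, p2_ms, p2_star_ms = decode_session_response(
    bytes.fromhex("50 03 00 32 02 58")
)
print(hex(session), p2_ms, p2_star_ms)
```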

Simultaneously with the positive response to switching sessions, the ECU starts its internal S3 timer. We see that the tester sends one Tester Present message to the ECU (3E 00). This resets the S3 timer. In fact, any UDS request would have reset it.

Finally, the ECU’s timer expires, and it jumps back to the default session. The tester would now have to repeat the change session command to get the ECU back to Extended Diagnostics. Any active UDS services that aren’t allowed to run in default session must now be inhibited.

Conclusion

UDS session control is a powerful tool that we can use to group UDS services together and to gate ECU functionality. ECU functions and UDS services can be selectively turned on or off depending on which session the ECU is in, and the ECU can perform any logic it needs to determine whether it’s allowed to enter a session. With this we ensure the safe operation of our ECU.

UDS Protocol

Protocol Description

UDS is a simple client-server centered protocol, with the client sending UDS requests and the server replying with UDS responses. The protocol itself is split into different types of requests, called UDS services, which in turn may (or may not) have sub-services.

Hierarchy of UDS datagrams

The client initiates a UDS request by sending the ECU a series of bytes on a reserved communication channel. This could be a special CAN identifier, a specific IP address and port, or any other unique communication channel used only by UDS. The reason is that the ECU listens specifically on this channel for UDS requests, and interprets the incoming data according to the UDS protocol.

UDS Service

The first byte in each UDS request is the Service ID. This byte represents the type of UDS service the client wants the ECU server to perform. In total, there are 27 services, each asking the ECU to perform some special task:

Service Name | Byte Value | Description
Diagnostic Session Control | 0x10 | UDS sessions are logical containers that act as gateways to enabling and disabling other services.
ECU Reset | 0x11 | This service instructs the ECU to perform a reset of itself, if safe to do so.
Security Access | 0x27 | Security allows the ECU to authenticate the tester tool to make sure it is authorized to command UDS services. Many services will not run unless the tester has already authenticated itself using this service.
Communication Control | 0x28 | Communication control allows the tester to shut off regular communications on a chosen channel. It is often used before a major download or upload sequence, to free up bandwidth on the communication medium.
Tester Present | 0x3E | Tester present is like a keep-alive signal between the client and the server. At a regular interval, the client sends a tester present message, indicating that the tester is still present (I know, super not obvious).
Access Timing Parameter | 0x83 | There are many key timing parameters in the UDS protocol dictating, among other things, how long the server has to respond to a command, and how often a client needs to refresh the tester present signal. To read these timings, or set new ones, this service can be used.
Secure Data Transmission | 0x84 | Authenticating the tester through service 0x27 will guarantee the legitimacy of the tester. However, individual messages may still go over the wire unencrypted. If there is a need to encrypt the UDS request or response, this service can be used.
Control DTC Setting | 0x85 | Diagnostic Trouble Codes (DTCs) indicate some failure detected by the ECU, and are generally not nice to have. During certain servicing events, the tester may want to temporarily turn off detection of DTCs, for which they’d use this service.
Response on Event | 0x86 | Response on event is an interesting service that has many cool use cases. It essentially tells the ECU to broadcast a UDS message when an event occurs. The type of event which produces the message can be programmed.
Link Control | 0x87 | Link control can be used to change the communication parameters between the client and server, for the purposes of increasing bandwidth. This could be a speed change (e.g. change the CAN bus baud rate from 500K to 1Mb), or a change in schedule table for LIN, or any other specified communication change.
Read Data by Identifier | 0x22 | Probably the most widely used UDS service, it allows the client to request a piece of data from the ECU using a special moniker (called the data identifier). The data identifier is generally a 16-bit number. The type of data can literally be anything (engine speed, software version, board serial number, etc).
Read Memory by Address | 0x23 | It’s what you think it is. Allows the client to read the data at a specified memory location on the ECU.
Read Scaling Data by Identifier | 0x24 | This service returns the scaling factor for the data from service 0x22. The usual use is when the data returned from 0x22 is scaled into the form of an unsigned integer. Applying the scaling factor from this service can convert it back to a floating point value.
Read Data by Periodic Identifier | 0x2A | Just like service 0x22, but instead of replying once with the data, this service instructs the ECU to continually broadcast the requested data.
Dynamically Define Data Identifier | 0x2C | Usually the data IDs used in service 0x22 are predefined by the ECU. However, using this service, the client can define new data identifiers containing sets of already existing IDs.
Write Data by Identifier | 0x2E | Just as service 0x22 is used to read defined pieces of data, service 0x2E is used to write pieces of data from the client to the server. For example, 0x2E can be used to write a serial number onto the ECU.
Write Memory by Address | 0x3D | Allows the client to write data to the ECU’s memory at a specified address.
Clear Diagnostic Information | 0x14 | When the ECU detects an error, it records a Diagnostic Trouble Code (DTC). Generally, DTCs stay in the ECU’s memory until the error has been absent for some number of key cycles, or until the tester clears them using this service.
Read DTC Information | 0x19 | Allows the tester to read all the faults and their related details from the ECU – possibly the second most widely used service.
Input / Output Control by Identifier | 0x2F | Like all the other “identifier” services, this allows the client to identify some I/O on the ECU by a special moniker and control it temporarily – for example, turning on a switch or overriding a PWM signal.
Routine Control | 0x31 | A favorite of the automotive world, this service allows you to do anything. Only your imagination is the limit here, but this is generally used to exercise some sequence of actions in the ECU for testing purposes (e.g. run the windshield wiper motors at half speed for 15 seconds).
Request Download | 0x34 | Used by the client to tell the ECU that it would like to download some data to the ECU’s memory – used as part of a UDS flash sequence for updating firmware.
Request Upload | 0x35 | Used by the client to tell the ECU it would like it to upload data from its memory back to the client.
Transfer Data | 0x36 | After a successful upload / download request, this service transfers the payload between client and server.
Request Transfer Exit | 0x37 | After a successful upload / download request and once all payload data has been transferred, this request states that there is no more data to transfer.
Request File Transfer | 0x38 | If the ECU employs a proper file system then this service is much better suited than 0x34 or 0x35, as it allows the exchange of a file abstraction, rather than a simple block of hex data.
Authentication | 0x29 | Brand new in the 2020 edition of ISO14229, this service is meant to bridge the gap with today’s new cybersecurity requirements by allowing a client to use PKI to authenticate itself.

There will be posts that go into detail on each of these services, complete with typical use cases, tips and tricks, and best practices.

Request Format

As mentioned above, the first byte that a client sends to the ECU according to the UDS protocol is the Service ID. The bytes that follow are then interpreted as sub-services and / or parameters of that service ID. Let’s see an example of a client request:

A typical UDS client request

As we see above, the client sends the bytes 19 02 01 to the ECU. If we dissect this, we can interpret the bytes as follows:

  1. 0x19 – From the table above, we see the client is requesting to read DTC information from the ECU.
  2. 0x02 – This service happens to have sub-services defined for it, and the byte immediately after the service ID is reserved for the sub-service ID. In this case, 0x02 is interpreted as sub-service 2, “Read DTC status by mask” (more on this later).
  3. 0x01 – This byte is now interpreted as a parameter to the sub-service “Read DTC status by mask”.
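The dissection above can be sketched in a few lines. The lookup tables here are deliberately partial and the function name is hypothetical; a real decoder would cover all services and know, per service, whether a sub-service byte is present.

```python
# Partial, illustrative tables; not a complete list of services.
SERVICE_NAMES = {
    0x10: "Diagnostic Session Control",
    0x19: "Read DTC Information",
    0x22: "Read Data by Identifier",
    0x3E: "Tester Present",
}
# Services (from the partial table above) whose second byte is a sub-service ID.
HAS_SUB_SERVICE = {0x10, 0x19, 0x3E}


def dissect_request(data: bytes):
    """Split a raw UDS request into (service name, sub-service, parameters)."""
    if not data:
        raise ValueError("empty request")
    sid = data[0]
    name = SERVICE_NAMES.get(sid, "unknown")
    if sid in HAS_SUB_SERVICE and len(data) >= 2:
        return name, data[1], data[2:]
    return name, None, data[1:]


name, sub_service, params = dissect_request(bytes.fromhex("19 02 01"))
print(name, sub_service, params.hex())
```

For a service without sub-services, such as Read Data by Identifier, everything after the service ID is returned as parameters.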

Response Format

Now that we’ve seen an example of a typical UDS request, let’s examine the possible values we can expect in return from the ECU.

Positive Response

A positive response is denoted by the ECU sending back as the first byte the requested service ID + 0x40. Memorize that rule – it comes in quite handy for deciphering logs. The bytes that follow will then be specific to the requested service, but in general either contain the requested data or simply an acknowledgement that the service was executed.

For example, let’s assume that our request from above was correctly processed by the ECU and resulted in a positive response. The return bytes would then be:

A partial positive response to our typical UDS client request

Remembering our first rule, we immediately recognize that 0x59 is 0x19 with 0x40 added to it, and hence this is indeed a positive response to our last service request. The other bytes are specific to the “Read DTC status by mask” sub-service.
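The "+ 0x40" rule is easy to check mechanically when scanning logs. Here is a minimal sketch; the function name and the response bytes after 0x59 are hypothetical.

```python
def is_positive_response(request: bytes, response: bytes) -> bool:
    """True if the response's first byte is the request's service ID + 0x40."""
    if not request or not response:
        return False
    return response[0] == request[0] + 0x40


request = bytes.fromhex("19 02 01")
print(is_positive_response(request, bytes.fromhex("59 02 FF")))  # 0x59 == 0x19 + 0x40
print(is_positive_response(request, bytes.fromhex("7F 19 11")))  # 0x7F is not 0x59
```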

Negative Response

A negative response is denoted by the first byte of the response being 0x7F. The second byte is always the service ID of the request that this negative response replies to. Finally, a third byte is included, called the Negative Response Code (or NRC, for short). The NRC is the ECU’s way of telling you why your request could not be fulfilled.

For example, suppose our request from above was met with a failure. Then, the returning bytes may look like:

UDS Server replies back negatively with response code 0x11

A great list of possible NRC’s has been compiled here. As you decipher more and more logs, many NRC’s will begin to look familiar, and you likely won’t need a manual by your side all the time.
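A negative response can be unpacked along the same lines. The NRC table below is a small subset chosen for illustration, and the function name is hypothetical.

```python
# A few common NRCs; the full list is much longer.
NRC_NAMES = {
    0x10: "generalReject",
    0x11: "serviceNotSupported",
    0x12: "subFunctionNotSupported",
    0x13: "incorrectMessageLengthOrInvalidFormat",
    0x22: "conditionsNotCorrect",
    0x31: "requestOutOfRange",
    0x33: "securityAccessDenied",
    0x78: "requestCorrectlyReceived-ResponsePending",
    0x7F: "serviceNotSupportedInActiveSession",
}


def decode_negative_response(data: bytes):
    """Return (rejected service ID, NRC, NRC name) for a 7F xx yy response."""
    if len(data) < 3 or data[0] != 0x7F:
        raise ValueError("not a negative response")
    sid, nrc = data[1], data[2]
    return sid, nrc, NRC_NAMES.get(nrc, "unknown")


sid, nrc, reason = decode_negative_response(bytes.fromhex("7F 19 11"))
print(hex(sid), hex(nrc), reason)
```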

Response Pending

There is a third category of response (at least to me, some may disagree). Although technically a negative response, there is an NRC reserved for what is known as “response pending”. This is the ECU’s way of informing you that, so far, your request has been neither accepted nor rejected. Rather, the ECU needs more time to give you a final response, which could still be negative or positive.

For example, below we see a transaction where the ECU replied with “response pending” for some time before rejecting the request:

NRC 0x78 indicates “Response is Pending” until ultimately the UDS server rejects our request

And below we see several iterations of “response pending” before the ECU returns a positive response:

NRC 0x78 indicates “Response Pending” until a positive response is received from the ECU
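On the client side, handling “response pending” amounts to skipping every 7F xx 78 frame until a final answer arrives. The sketch below works on an already-captured list of responses; the function name and the example frames are hypothetical, and a real client would also enforce the P2*server_max timeout between frames.

```python
def final_response(sid: int, responses):
    """Skip 'response pending' frames; return (final response, pending count)."""
    pending = 0
    for frame in responses:
        is_pending = (
            len(frame) >= 3
            and frame[0] == 0x7F
            and frame[1] == sid
            and frame[2] == 0x78
        )
        if is_pending:
            pending += 1
            continue
        return frame, pending
    raise TimeoutError("no final response received")


# Two pending frames, then a positive response to service 0x19.
frames = [bytes.fromhex("7F 19 78"), bytes.fromhex("7F 19 78"), bytes.fromhex("59 02 FF")]
frame, waited = final_response(0x19, frames)
print(frame.hex(), waited)
```

The final frame can still be a negative response with a different NRC, which matches the first transaction described above.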

Conclusion

Now that you know the basics of the protocol, you can begin diving into each specific service, what it means, and how to utilize it. Look out for more posts where I delve into all the services and tie them together to show typical UDS session flows.

Unified Diagnostic Services Introduction (UDS)

What is UDS?

Chances are if you’ve worked in automotive software, or have an interest in cars, you’ve come across the term UDS. At its core, Unified Diagnostic Services, more commonly known as UDS, is a set of special functions that every vehicle controller is expected to implement so that external entities can query it. Queries can range from “how fast are you spinning your motor” to “give me 20 bytes of data located at address 0xA0000000” to “download 256 bytes to a location I previously specified”. Every one of these queries takes the form of a UDS request, and the reply back from the controller is a UDS response.

Let’s Get Some Terms out of the Way

When starting the conversation about UDS, it’s helpful to align on some terms and assumptions. First off, the going assumption is that we are talking about UDS in the context of automotive electronic control units, otherwise known as ECUs.

Now, when talking about UDS, what we’re talking about is the protocol that is specified in ISO14229. That is, a special set of bytes exchanged over some medium (UDS is technically medium agnostic, meaning we could be talking UDS over CAN, FlexRay, LIN, Ethernet, whatever). UDS is a client-server driven protocol, with the ECU always acting as the server, and the entity sending requests to the ECU being named the client. For historical reasons, many in the industry will refer to the UDS client as the tester.

If you ever get confused about which one is the client and which one is the server in this relationship, just think of it like websites. The thing with all the goodies on it is the web server, and the guy who wants all the goodies is the web client. Similarly, the ECU has all the information you actually want, and so it is the server. And you, the requester, are the client.

So why do we call the UDS Client the tester? It starts with why we need UDS in the first place. An automobile is an incredibly complicated machine – it’s literally an orchestra of electrical, mechanical, and software elements all working in perfect harmony to provide propulsion (and these days a whole lot more). As such, when things go wrong, it can be incredibly difficult to figure out why. Imagine taking your car to the mechanic because you hear funny noises coming from the engine. The mechanic might have an idea of what the problem could be based on the noise, but more likely than not she’ll have to ask the on-board ECUs for as much information as possible about the state of the car. She’ll likely use different functions provided by UDS to test the portions of the car she thinks are at fault. And because automotive folks aren’t known for their creative naming, the entity running these UDS requests has been colloquially named the tester.

Finally, UDS itself is broken down into varying functions called services. A service is simply some functionality that is provided by the ECU to the tester. Like the graphic above shows, there is a service to read pieces of information. Similarly, there is a service to write pieces of information from the client to the server. An example of this could be the serial number of the ECU (this could be done during manufacturing, in which case the tester isn’t a mechanic, but rather a computer on the assembly line). There are even services that are used to reprogram an ECU after it’s been installed in a vehicle. We’ll go through each service in detail in later posts.

What’s the Point of UDS?

As alluded to above, the stated purpose of UDS is to help mechanics repair your car. Of course, there are many secondary uses of UDS which make it such an important topic in automotive circles – but the primary goal of UDS is to allow service personnel to capture all the relevant data from your vehicle to help them figure out what, if any, problems are present. As an interesting side note, with many American states legislating right to repair laws, these service professionals could soon be regular folks like you and me.

Secondary uses of UDS are widespread throughout the industry, and UDS is often adapted by engineers to perform functions outside the normal operating scope of a particular ECU. For example, many manufacturers (all the way from OEMs down to Tier 2s) will use UDS services in their assembly plants to ensure the component just manufactured is operating correctly. Sometimes test engineers will have access to special UDS services that allow them to gather deep insights into the software running on their controllers during debugging. And as mentioned earlier, UDS provides key functionality for updating the software of an ECU.

What’s Next?

Now that we have an abstract understanding of what UDS is and what it’s used for, we can take a deeper dive into the actual protocol behind it. That’ll be in the next post: The UDS Protocol.