Top 5 Considerations in Choosing a Unique Identifier (UID)

As a lab operation grows, so does the need for a unique identifier (UID) for each sample, both to track its processing and to associate it with particular biological or chemical properties and analytical results. Often, the first step in implementing a new process or adopting a laboratory information management system (LIMS) is choosing a unique identifier, specifically its length and format.


Choosing a UID

1. Number of Samples 

While this may seem obvious, the more samples your lab processes in a given time, the longer your UID will need to be. For example, if your lab processes 1,000 samples per day, you would need 250,000 UIDs each year (assuming roughly 250 working days per year). If you want to make sure that no ID is repeated over a 10-year period, that comes to 2.5 million UIDs. With a strictly numeric identifier, your UID would need to be at least 7 digits, which would allow you to encode up to 10 million samples uniquely (9,999,999 if the all-zero ID is not used). Remember to build in a reasonable safety factor when choosing the length of your UID. It is always better to err on the side of too long when choosing a unique identifier.
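The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name `min_numeric_digits` and its parameters are made up for this example, and it assumes leading zeros are allowed, so that an n-digit UID has 10^n possible values.

```python
import math

def min_numeric_digits(samples_per_day, working_days_per_year, years, safety_factor=1.0):
    """Smallest number of decimal digits that gives every sample its own ID."""
    required = math.ceil(samples_per_day * working_days_per_year * years * safety_factor)
    digits = 1
    # An n-digit numeric UID (leading zeros allowed) has 10**n distinct values.
    while 10 ** digits < required:
        digits += 1
    return digits

# The example from the text: 1,000 samples/day, 250 working days, 10 years.
print(min_numeric_digits(1000, 250, 10))       # 7
# The same lab with a 5x safety factor needs one more digit.
print(min_numeric_digits(1000, 250, 10, 5.0))  # 8
```

The second call shows how quickly a safety factor can push the UID to the next length, which is why it pays to decide on the factor before fixing the format.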

2. Longevity of Samples

If samples are used temporarily for processing or disposed of after a short amount of time, the number of required UIDs will be dramatically lower than if the samples have to be stored for many years. Take the same example of a lab that processes 1,000 samples a day, but now discards the samples after 30 days instead of storing them for 10 years. Assuming IDs can be reused once samples are discarded, only about 30,000 IDs are in use at any one time. This lab could comfortably use a 5-digit identifier, which would encode up to 100,000 samples and still leave a roughly 3X safety factor in case there is a surge in business.
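As a rough check of this example, the safety factor can be computed as the ratio of available IDs to IDs in use at any one time. The helper below is a hypothetical sketch, and it assumes that an ID becomes available again as soon as its sample is discarded.

```python
def safety_factor(uid_digits, samples_per_day, retention_days):
    """Ratio of available numeric IDs to IDs held by samples still in the lab."""
    capacity = 10 ** uid_digits                 # e.g. 5 digits -> 100,000 IDs
    in_use = samples_per_day * retention_days   # samples not yet discarded
    return capacity / in_use

# 5-digit UID, 1,000 samples/day, samples discarded after 30 days.
print(round(safety_factor(5, 1000, 30), 2))  # 3.33
```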

3. Amount of Human Interaction

Processes that require human interaction can benefit from UIDs that are both rational (vs. random or serial) and alphanumeric (vs. purely numeric), so that the UID itself communicates information without the user having to reference a database or list.

Purely numeric identifiers are more easily processed and managed by automated computer systems. Alphanumeric identifiers with rational elements, such as prefixes and suffixes corresponding to specific sample groups, can allow human operators to categorize and process samples quickly by eye. An example would be the vehicle identification number (VIN) of your car.

Here is an example of how to decode a VIN from Wikipedia

Remember that employing rational elements and constraining the characters in a UID limits the number of available UIDs for a given UID length. For example, constraining the first character of a UID to a letter (26 choices) reduces the available UIDs by 28% compared to allowing the first character to be either a letter or a number (36 choices).
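The 28% figure can be reproduced with a short calculation. The sketch below assumes a case-insensitive alphabet (26 letters plus 10 digits) and a 6-character UID. The length is arbitrary, since the reduction depends only on the first character, and the names `uid_count`, `LETTERS`, and `ALPHANUMERIC` are illustrative.

```python
import string

LETTERS = string.ascii_uppercase         # 26 characters
ALPHANUMERIC = LETTERS + string.digits   # 36 characters

def uid_count(length, first_char_alphabet, other_alphabet=ALPHANUMERIC):
    """Number of distinct UIDs when only the first character is constrained."""
    return len(first_char_alphabet) * len(other_alphabet) ** (length - 1)

unconstrained = uid_count(6, ALPHANUMERIC)  # any character in every position
letter_first = uid_count(6, LETTERS)        # first character must be a letter
reduction = 1 - letter_first / unconstrained
print(f"{reduction:.1%}")  # 27.8%
```

Each additional constrained position multiplies the loss, so a two-letter prefix costs noticeably more capacity than a one-letter prefix.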

4. Variety of Samples

The greater the variety of samples in your process, the more you may want to use rational elements and / or constrain characters in the UID to help categorize the samples for easy sorting and processing. If you are fortunate enough to have just a single sample type, a serial numeric identifier will work cleanly without the need to create and maintain keys that decode the meaning of particular prefixes and suffixes.

5. Barcoding

As labs implement LIMS databases and look to reduce errors associated with manual entry of information, barcoding continues to grow in use.  While varying in degree, barcodes typically encode numeric UIDs much more efficiently than alphanumeric UIDs.
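One reason for the difference is that some symbologies pack digits more tightly than letters. In Data Matrix, for instance, the default ASCII encodation stores a pair of digits in a single codeword, while a letter takes a codeword of its own. The sketch below is a deliberately simplified count of data codewords under that one rule. It assumes plain ASCII UIDs and ignores the other encodation modes, padding, and error correction, so a real encoder may do better on alphanumeric input. The function name is hypothetical.

```python
def ascii_mode_codewords(uid):
    """Simplified count of Data Matrix data codewords in ASCII encodation."""
    count = 0
    i = 0
    while i < len(uid):
        pair = uid[i:i + 2]
        if len(pair) == 2 and pair.isdigit():
            i += 2   # two consecutive digits share one codeword
        else:
            i += 1   # any other character takes a codeword of its own
        count += 1
    return count

# Two 8-character UIDs: one purely numeric, one with letter groups.
print(ascii_mode_codewords("00123456"))  # 4
print(ascii_mode_codewords("AB12CD34"))  # 6
```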

The table at the following link shows maximum numeric and alphanumeric barcoding capacity for Data Matrix barcodes.


Conclusions On Choosing a Unique Identifier (UID) for Your Lab

I’m sure there are many more factors unique to your lab that will affect the process of choosing a unique identifier. For some it will be simple math, while others will benefit from more complex codes and a LIMS.
