Message length
Message length is determined by the number of bytes in the message text. A single payload can hold at most 140 bytes. If the message text is longer than 140 bytes, it becomes a concatenated (multi-part) message. In a multi-part message, some of the payload bytes are used for a user data header (UDH). The UDH carries identification and ordering information so that the receiving device can reassemble the separate payloads, which may arrive at the handset out of order, into a single readable message. The UDH takes 6 bytes (48 bits), which reduces the number of characters that fit in each message part.
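As an illustration of what those six bytes can look like, here is a minimal sketch of the common 8-bit-reference concatenation header described in 3GPP TS 23.040. The function name `build_concat_udh` is hypothetical, and whether Campaign or a given gateway uses exactly this header variant is an assumption; the text above only states that the UDH occupies 6 bytes.

```python
def build_concat_udh(ref: int, total: int, seq: int) -> bytes:
    """Build a 6-byte UDH for one part of a concatenated SMS (8-bit reference)."""
    if not 0 <= ref <= 0xFF:
        raise ValueError("ref must fit in one byte")
    if not 1 <= seq <= total <= 0xFF:
        raise ValueError("need 1 <= seq <= total <= 255")
    return bytes([
        0x05,   # UDHL: five more header bytes follow
        0x00,   # IEI: concatenated short message, 8-bit reference number
        0x03,   # IEDL: three bytes of element data follow
        ref,    # message reference, identical in every part of one message
        total,  # total number of parts
        seq,    # position of this part, starting at 1
    ])


udh = build_concat_udh(ref=0x2A, total=3, seq=2)
print(udh.hex())       # 0500032a0302
print(140 - len(udh))  # 134 payload bytes remain for text
```

The reference byte lets the handset group parts that belong together, and the sequence byte lets it put them back in order.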
SMS primarily supports two character sets: GSM-7 and UCS-2/UTF-16.
Although Campaign can send up to 800 GSM characters or 400 UCS-2 characters in six parts, some markets limit the number of concatenated messages or have other specific requirements. Check with the governing bodies of the market for details.
GSM character sets
The way you create your SMS messages impacts the way your messages are sent and received by your customers. Understanding a few key concepts and incorporating best practices will ensure your messages are delivered the way you expect and reduce the risk of unexpected costs.
Depending on which alphabet or character set you use, SMS messages typically contain a maximum of 160 7-bit characters or 70 2-byte characters. If the allowed character count is exceeded, the SMS is split into multiple messages and additional costs are assessed accordingly.
GSM-7
The GSM-7 character set supports most, but not all, characters for languages that use a Latin-based alphabet, such as English, Spanish, and French. Like ASCII, the GSM character encoding uses seven bits to represent each character. One SMS message that uses GSM can contain a maximum of 160 characters.
GSM characters
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z a b c d e f g h i j k l m n o p q r s t u v w x y z 0 1 2 3 4 5 6 7 8 9 : ; < = > ? ¡ ¿ ! " # ¤ % & ( ) ' * + , - . / Ä Ö Ñ Ü § ä ö ñ ü à @ £ $ ¥ è é ù ì ò Ç Ø ø Å å Δ _ Φ Γ Λ Ω Π Ψ Σ Θ Ξ Æ æ ß É
The following characters are also supported by GSM-7, but each one counts as two GSM-7 characters:
€ ^ { } [ ] ~ |
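The counting rule above can be sketched as follows. This is an illustrative helper, not Campaign's implementation: `gsm7_length` is a hypothetical name, the two character sets are copied from the lists above, and the whitespace characters (space, line feed, carriage return) are added as an assumption because they do not appear in the printed list.

```python
import string

# Characters that cost one GSM-7 character each (from the list above).
GSM7_BASIC = set(
    string.ascii_letters
    + string.digits
    + ":;<=>?¡¿!\"#¤%&()'*+,-./ÄÖÑÜ§äöñüà@£$¥èéùìòÇØøÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ"
    + " \n\r"  # assumption: whitespace is not shown in the printed list
)
# Characters that cost two GSM-7 characters each.
GSM7_EXTENDED = set("€^{}[]~|")


def gsm7_length(text):
    """Return the GSM-7 character cost of text, or None if it is not GSM-7."""
    total = 0
    for ch in text:
        if ch in GSM7_BASIC:
            total += 1
        elif ch in GSM7_EXTENDED:
            total += 2
        else:
            return None  # at least one character needs Unicode
    return total


print(gsm7_length("Price: 5€"))  # 10
print(gsm7_length("Zoë"))        # None
```

A return value of `None` signals that the message cannot stay in GSM-7 as written.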
GSM-7 messages break down as follows:
- Standard single SMS messages: For GSM phones with 7-bit character encoding, a standard SMS message can contain a maximum of 160 characters. That is 1120 bits / (7 bits/character) = 160 characters for a single SMS message.
- GSM-7 multi-part or concatenated messages: When the message text is longer than 160 GSM characters, the message is concatenated and sent. When a message is concatenated, the user data header (UDH) consumes 6 bytes or 48 bits. This reduces the maximum number of characters in each message part:
1120 bits - 48 bits = 1072 bits
1072 bits / (7 bits/character) = 153 characters per message part.
Note: 153 characters * (7 bits/character) = 1071 bits, which leaves one bit unused. That bit cannot represent a full character. It is added as padding after the 48-bit header so that the 7-bit encoded data begins on a septet boundary, at the 50th bit.
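The arithmetic above can be checked with a few lines of Python. The constant and function names are hypothetical; the numbers come from the text.

```python
PAYLOAD_BITS = 140 * 8  # 1120 bits in one SMS payload
UDH_BITS = 6 * 8        # 48 bits for the user data header
SEPTET = 7              # bits per GSM-7 character

single = PAYLOAD_BITS // SEPTET                           # one-part limit
per_part = (PAYLOAD_BITS - UDH_BITS) // SEPTET            # limit per concatenated part
padding = (PAYLOAD_BITS - UDH_BITS) - per_part * SEPTET   # leftover bits


def gsm7_parts(septets: int) -> int:
    """Number of SMS parts needed for a given GSM-7 character cost."""
    if septets <= single:
        return 1
    return -(-septets // per_part)  # ceiling division


print(single, per_part, padding)                          # 160 153 1
print(gsm7_parts(160), gsm7_parts(161), gsm7_parts(307))  # 1 2 3
```

Note the jump at 161 characters: the message does not overflow by one character into a second part, it is re-split into parts of at most 153.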
UCS-2/UTF-16
This character set is used for languages written in non-Latin scripts, such as Arabic, Chinese, and languages that use Cyrillic. Because the characters for these languages are supported within Unicode, each character uses 16 bits instead of 7 bits.
The two types of UCS-2/UTF-16 messages are:
- Standard single SMS message (Unicode): For Unicode phones with 16-bit character encoding, a standard SMS message can contain up to 70 characters. That is 1120 bits / (16 bits/character) = 70 characters.
- UCS-2/UTF-16 multi-part or concatenated messages: When the message text is longer than 70 UCS-2 characters, the message is concatenated and sent. When a message is concatenated, the user data header (UDH) consumes 6 bytes or 48 bits. This reduces the maximum number of characters in each message part:
1120 bits - 48 bits = 1072 bits
1072 bits / (16 bits/character) = 67 characters per message part
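A short sketch of the same calculation for Unicode text follows. It assumes the 70 and 67 limits are counted in 16-bit UTF-16 code units, which means a character outside the Basic Multilingual Plane (most emojis, for example) consumes two units. The helper names are hypothetical.

```python
def utf16_units(text: str) -> int:
    """Count 16-bit code units; characters outside the BMP take two."""
    return len(text.encode("utf-16-be")) // 2


def ucs2_parts(text: str) -> int:
    """Number of SMS parts needed when the text is sent as UCS-2/UTF-16."""
    units = utf16_units(text)
    if units <= 70:
        return 1
    return -(-units // 67)  # ceiling division


print(utf16_units("Привет"))                       # 6
print(utf16_units("👍"))                           # 2
print(ucs2_parts("я" * 70), ucs2_parts("я" * 71))  # 1 2
```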
Note: Campaign chooses the character set based on the characters detected in the message. When a character that is not supported by GSM-7 is detected, the entire message is converted to Unicode, so every character in the message uses its UTF-16 equivalent. An SMS message can only ever use one encoding: either fully GSM-7 or fully UTF-16.
UTF-8
Acoustic Campaign does support and transmit SMS messages as UTF-8, but how the message is processed depends on the capabilities of the mobile phones, mobile carriers, and local SMS gateways or vendors.
Though UTF-8 is a common Unicode encoding that supports characters for many languages, it is not widely supported across the SMS technology field. Most mobile phones are configured to support GSM-7, UCS-2 (UTF-16), or both. Even where phones do support UTF-8, most mobile carriers do not, as they favor the much more robust UTF-16 format. The local SMS gateway (the vendor or the aggregator) must also support UTF-8. Many do, but not necessarily in all markets.
In the instance where UTF-8 is fully supported, an SMS message breaks down as follows:
- Standard single SMS message (Unicode): For Unicode phones with 8-bit character encoding, a standard SMS message can contain a maximum of 140 characters. That is 1120 bits / (8 bits/character) = 140 characters.
- UTF-8 multi-part or concatenated messages: When the message text is longer than 140 UTF-8 characters, the message is concatenated and sent. When a message is concatenated, the user data header (UDH) consumes 6 bytes or 48 bits. This reduces the maximum number of characters in each message part:
1120 bits - 48 bits = 1072 bits
1072 bits / (8 bits/character) = 134 characters per message part
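One caveat worth illustrating: UTF-8 is a variable-width encoding, so the 140 and 134 figures above hold only when every character fits in a single byte. A minimal sketch, with hypothetical helper names, that budgets by bytes rather than characters:

```python
def utf8_bytes(text: str) -> int:
    """Size of the text in bytes when encoded as UTF-8."""
    return len(text.encode("utf-8"))


def utf8_parts(text: str) -> int:
    """Number of SMS parts needed, counting bytes rather than characters."""
    size = utf8_bytes(text)
    if size <= 140:
        return 1
    return -(-size // 134)  # ceiling division


print(utf8_bytes("hello"))  # 5
print(utf8_bytes("é"))      # 2
print(utf8_bytes("語"))     # 3
print(utf8_parts("a" * 140), utf8_parts("a" * 141))  # 1 2
```

This only counts bytes. A real splitter would also need to avoid cutting a multi-byte character in half at a part boundary.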
| Message type | Max character count for GSM | Max character count for UTF-16 | Max character count for UTF-8 |
|---|---|---|---|
| Standard single SMS message | 160 | 70 | 140 |
| Two concatenated SMS messages | 306 | 134 | 268 |
| Three concatenated SMS messages | 459 | 201 | 402 |
| Four concatenated SMS messages | 612 | 268 | 536 |
| Five concatenated SMS messages | 765 | 335 | 670 |
| Six concatenated SMS messages | 800 | 400 | |

Maximum character count for supported encoding systems
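The GSM and UTF-16 columns of the table can be reproduced with a short sketch. It shows why the six-part row reads 800 and 400: six full parts would allow 918 and 402 characters, but Campaign's stated maximums apply first. The names are hypothetical, and no cap is assumed for UTF-8 because the table gives no six-part value for it.

```python
# (single-message limit, per-part limit) for each encoding, from the text.
PER_PART = {"GSM": (160, 153), "UTF-16": (70, 67), "UTF-8": (140, 134)}
# Campaign's stated maximums; none is documented for UTF-8.
CAMPAIGN_CAP = {"GSM": 800, "UTF-16": 400}


def max_chars(encoding: str, parts: int) -> int:
    """Maximum character count for a message sent in the given number of parts."""
    single, concat = PER_PART[encoding]
    limit = single if parts == 1 else concat * parts
    cap = CAMPAIGN_CAP.get(encoding)
    return limit if cap is None else min(limit, cap)


for parts in range(1, 7):
    print(parts, max_chars("GSM", parts), max_chars("UTF-16", parts))
# last line printed: 6 800 400
```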
Character swap
Character swap replaces a character in an SMS message with another when that character is not part of the GSM character set. For example, the character "ë" is replaced by the GSM character "e" when your message is sent. By default, character swap is not active. To allow character swap, select the appropriate checkbox when creating your SMS program.
If you choose not to use character swap, any characters sent in binary format (Unicode) are sent as they are. However, SMS messages with Unicode characters are limited to 70 characters. If the maximum number of characters is exceeded, the message is split into multiple messages, which may incur additional costs.
Also, keep in mind that inserting personalization fields in your SMS messages can add non-GSM encoded characters that can then cause additional message costs.
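To make the idea of character swap concrete, here is a rough sketch that replaces an unsupported accented letter with its unaccented base letter using Unicode decomposition. This is only one possible approach and an assumption on our part: the actual substitution table Campaign uses is not described here, and `swap_to_base` and `ALLOWED` are hypothetical names with a deliberately tiny allowed set.

```python
import string
import unicodedata

# Tiny stand-in for the GSM character set, for illustration only.
ALLOWED = set(string.ascii_letters + string.digits + " é")


def swap_to_base(text: str, allowed: set) -> str:
    """Replace characters outside `allowed` with their base letter when possible."""
    out = []
    for ch in text:
        if ch in allowed:
            out.append(ch)
            continue
        # NFD splits "ë" into "e" plus a combining diaeresis.
        base = unicodedata.normalize("NFD", ch)[0]
        out.append(base if base in allowed else ch)  # keep ch if no swap exists
    return "".join(out)


print(swap_to_base("Noël café", ALLOWED))  # Noel café
```

Characters with no usable base letter are left unchanged, so a message containing them would still be sent as Unicode.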
Add emojis and symbols to the SMS message body
You can add emojis and symbols as part of your SMS text (both in SMS Draft and Automated Message), and Acoustic will do its best to ensure that messages containing them are delivered to your mobile carrier. To date, emojis and symbols have not been standardized across carriers.
Using emojis limits your message to 70 characters because emojis require Unicode encoding. If your message exceeds 70 characters and requires multiple parts, the number of available characters in each part is reduced to 67 to allow for the user data header (UDH). The UDH is used to concatenate the parts so that they display as a single message (if supported by the carrier and the device).
- If you are using macOS, the easiest way to add emojis to an SMS message is to click Edit from your web browser menu and select Emoji & Symbols.
- If you’re using Windows, you can install the Emoji Keyboard extension for Firefox or Chrome browsers.
- Alternatively, you can copy and paste the glyph directly into the message or enter the HTML code.
Finally, verify on the Confirm and send page that your symbols and emojis display as expected.
Best practices
- Be aware that a single character outside the GSM character set converts the entire message to Unicode, so your message is split into multiple messages if it exceeds 70 characters.
- If you use Microsoft Office to compose your messages, turn off the “Smart quotes” feature. This feature converts apostrophes and quotation marks to characters not supported in the GSM character set. Straight quotes and apostrophes are supported in GSM but curly quotes and apostrophes are not.
- Some Latin American countries support Latin1 or ASCII only, which limits message length to 140 characters.
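If message text is prepared outside Campaign, curly quotes can also be straightened before the text is pasted in. A minimal sketch, with the hypothetical helper name `straighten_quotes`, covering only the four most common smart-quote characters:

```python
# Map curly quotes and apostrophes to their straight GSM equivalents.
SMART_TO_STRAIGHT = str.maketrans({
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark / apostrophe
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
})


def straighten_quotes(text: str) -> str:
    """Replace smart quotes with straight quotes supported by GSM-7."""
    return text.translate(SMART_TO_STRAIGHT)


print(straighten_quotes("\u201cDon\u2019t miss out\u201d"))  # "Don't miss out"
```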