New for AppleScript in Mac OS X v10.5

AppleScript 2.0 in Mac OS X Leopard is a significant release with important updates and modifications, including full Unicode support, new intrinsic application properties and constructs, new scriptable system preferences, and much more.

Unicode Support

AppleScript is now entirely Unicode-based. Comments and text constants in scripts may contain any Unicode characters, and all text processing is done in Unicode, so all characters are preserved correctly regardless of the user’s language preferences. For example, this script works correctly in AppleScript 2.0, where it would not have in previous versions:

set the Japanese_phrase to "日本語"
set the Russian_phrase to "Русский"
set the new_phrase to the Japanese_phrase & " and " & the Russian_phrase
--> returns: "日本語 and Русский"

There is no longer a distinction between Unicode and non-Unicode text. There is exactly one text class, named “text”: that is, class of "How now brown cow" returns text. It is functionally equivalent to the former Unicode text class, so it may contain any Unicode character, and has two new features:

  • Text objects have an id property, which may also be used as an address.

    Together, these map between Unicode code point values and the characters at those code points: for example, id of "A" returns 65, and character id 65 returns "A". The id of text longer than one code point is a list of integers, and a list of integers used as an id yields the corresponding text: for example, id of "hello" returns {104, 101, 108, 108, 111}, and string id {104, 101, 108, 108, 111} returns "hello". (Because of a bug, text id ... does not work; you must use one of the synonymous class names, such as string.) These make the older ASCII character and ASCII number commands obsolete: unlike those commands, they cover the full Unicode character range and return the same results regardless of the user’s language preferences.
     
  • Character elements of text count a combining character sequence as a single character.

    This relates to a feature of Unicode: some “characters” may be represented either as a single entity or as a base character plus a series of combining marks. For example, “é” may be encoded either as U+00E9 (LATIN SMALL LETTER E WITH ACUTE) or as U+0065 (LATIN SMALL LETTER E) followed by U+0301 (COMBINING ACUTE ACCENT). AppleScript 2.0 counts either encoding as one character, where older versions counted the base character and combining mark separately.
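
As a rough illustration of both features, the following sketch round-trips text through the id property and then counts the characters in the two encodings of “é”. The variable names are invented for this example, and the results shown are the ones the descriptions above imply for AppleScript 2.0.

set the code_points to id of "日本語"
--> returns: {26085, 26412, 35486}
set the round_trip to string id code_points
--> returns: "日本語"

set the precomposed_e to character id 233 -- U+00E9
set the decomposed_e to string id {101, 769} -- U+0065 followed by U+0301
count characters of the precomposed_e
--> returns: 1
count characters of the decomposed_e
--> returns: 1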

Compatibility

For compatibility with pre-2.0 AppleScript, string and Unicode text are still defined, but are considered synonyms for text. For example, all three of these statements have the same effect:

some_verbage as text
some_verbage as string
some_verbage as Unicode text

In addition, text, string, and Unicode text all compare as equal. For example, class of "how now brown cow" is string evaluates to true, even though class of "how now brown cow" returns text. Applications can still distinguish between the three types, even though AppleScript itself does not.
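
A short sketch of what this means in practice, reusing the some_verbage variable from above with an assumed value. The results shown follow from the behavior described in this section:

set some_verbage to "how now brown cow"
class of (some_verbage as Unicode text)
--> returns: text
class of some_verbage is string
--> returns: true
class of some_verbage is Unicode text
--> returns: true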

Now that AppleScript preserves all characters correctly worldwide, it is also stricter about the text used in scripts. AppleScript syntax uses several non-ASCII characters, such as “≠” and “¬”. These characters must be typed exactly as the AppleScript Language Guide for AppleScript 1.3.7 describes. Because “«” and “»” do not exist in some Asian national encodings, “《” and “》” are allowed as synonyms for them.
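
For instance, a script that uses both of the characters mentioned above might look like the following minimal sketch. The variable names are hypothetical and chosen only for illustration.

set a to 1
set b to 2
if a ≠ b then
    -- "¬" continues the statement on the next line
    set the_message to "The values differ, " & ¬
        "and this statement continues on a second line."
end if

Where typing “≠” is inconvenient, the comparison can also be spelled out in ASCII as is not equal to.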

Because AppleScript 2.0 scripts store all text as Unicode, any text constant counts as a use of the former Unicode text class, which works with any version of AppleScript back to version 1.3. A script that contains Unicode-only characters such as Arabic or Thai will run, but cannot be edited correctly in any version of Mac OS or Mac OS X older than Mac OS X 10.5: the Unicode-only characters will be lost.

Use of the new id property requires AppleScript 2.0.
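
A script that must also run under older versions could guard its use of id with a version check, along the lines of the sketch below. This is one possible approach rather than a prescribed one: the variable names are hypothetical, and how a given pre-2.0 version treats the branch it does not execute has not been verified here.

considering numeric strings
    -- compare version components as numbers, so "1.10" sorts after "1.9"
    set has_text_id to (version of AppleScript) ≥ "2.0"
end considering

if has_text_id then
    set code_point to id of "A"
else
    -- fall back to the older command on pre-2.0 systems
    set code_point to ASCII number "A"
end if

For a plain ASCII character such as "A", both branches are expected to yield 65; the results diverge only for characters outside the range the older command handles.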