Expression syntax guidelines
Expressions are the key components of Endpoint Data Discovery (EDD) rules and are used during an EDD scan to detect specific file content.
An expression may be a single word, such as "confidential", or it may include a combination of words, variables, special characters, and operators, such as:
(Confidential W/1 (Data OR Image OR Document))
This expression will detect any of the following phrases included in a file:
- confidential data
- Confidential image
- confidential Document
To get started creating your own custom EDD rules, you first need to know how to use special characters and operators in your expressions, and understand the general guidelines and limitations that apply.

EDD rule expressions support the following special characters:
Character |
Example |
Description |
---|---|---|
? |
???==== |
Match any single character [alphanumeric, - (dash), or _ (underscore)]. When the example text is used in an expression, a match is detected if any 3 characters are followed by any 4 digits, such as D4B5321. |
= |
===-==-==== |
Match any single digit [numeric only]. When the example text is used in an expression, a match is detected if any 9 digits are found in a file in the defined format, such as 123-45-6789. |
* |
in* |
Match zero or more consecutive characters [alphanumeric, - (dash), and _ (underscore)]. When the example text is used in an expression, a match is detected if the characters "in" are followed by zero or more characters, such as "in", "interest", or "in543". |
~ |
borrowing~ |
Match all words that contain the root form of a word. When the example text is used in an expression, a match is detected if a word contains the root "borrow", such as "borrow", "borrowed", "borrows", or "borrowing". |
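As a rough mental model, the ?, =, and * characters described above behave much like a small subset of regular expressions. The Python sketch below emulates them for a single word. It is an illustration only, not how the EDD scan engine is implemented: the helper names pattern_to_regex and matches are hypothetical, and the root-form character (~) is left out because it would require word stemming.

```python
import re

# Character class assumed from the table: alphanumeric, dash, or underscore.
WORD_CHAR = r"[A-Za-z0-9_-]"


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate an EDD-style word pattern into an approximate regex."""
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(WORD_CHAR)        # any single word character
        elif ch == "=":
            parts.append(r"\d")            # any single digit
        elif ch == "*":
            parts.append(WORD_CHAR + "*")  # zero or more word characters
        else:
            parts.append(re.escape(ch))    # literal character
    # Expressions are not case sensitive, so ignore case here too.
    return re.compile("".join(parts), re.IGNORECASE)


def matches(pattern: str, word: str) -> bool:
    """Return True if the whole word fits the pattern."""
    return pattern_to_regex(pattern).fullmatch(word) is not None


print(matches("???====", "D4B5321"))          # pattern and sample from the table
print(matches("===-==-====", "123-45-6789"))
print(matches("in*", "interest"))
```

A real scan also tokenizes file content and applies operators; this sketch only covers the per-word character matching.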

EDD rule expressions support the following operators:
Operator |
Example |
Description |
---|---|---|
( ) |
Patient W/1 (ID OR record OR number) W/1 ??===== |
Sets precedence and defines grouping |
W/N |
Interest W/2 (Rate OR Income) |
Words or phrases occur within N words of each other. When the example text is used in an expression, a match is detected if "Interest" is within 2 words of "Rate" or "Income" (either before or after). For example, the phrase "rate of interest" would generate a match. W/1 means the words are adjacent. |
W/0 |
(=== === === NOT W/0 (0== === === OR 8== === === OR 9== === ===)) |
Words or phrases occur within 0 words of each other. "W/0" essentially means "this word" and is often used in conjunction with the NOT operator. When the example text is used in an expression, a match is detected if a string contains 9 digits in the defined format, but not if it begins with 0, 8, or 9. |
NOT |
(=== == ==== NOT W/0 000 == ====) |
Negates the results of an expression. You can use the NOT operator in conjunction with the W/N operator. When the example text is used in an expression, a match is detected if a string contains 9 digits in the defined format, but not if it begins with 000. |
AND |
Account AND (Number OR Statement) |
To generate a match, all values must be matched. When the example text is used in an expression, a match is detected if either of the phrases "Account Number" or "Account Statement" is found in a file. |
OR |
(Number OR Nr OR Num) W/1 ======= |
To generate a match, one value must be matched. When the example text is used in an expression, if the phrase "Nr 5546372" is found in a file, a match is detected. |
@luhn |
@luhn (==== ==== ==== ====) |
To generate a match, the string must contain a valid Luhn algorithm check digit. The Luhn algorithm, also known as the modulus 10 or mod 10 algorithm, is a simple checksum formula used to validate identification numbers, such as credit card numbers. This operator is used in the Credit Card template. When the example text is used in an expression, a match is detected if a 16 digit number, in the defined format, contains a valid check digit. |
@Date_Any_Alpha (<locale parameter>) |
@Date_Any_Alpha (en_US) |
To generate a match, the string must contain a valid alphanumeric date supported by the locale parameter. Alphanumeric dates use month names, such as September 7, 2018, or month abbreviations, such as 15 Aug 2018. When the example text is used in an expression, a match is detected if an alphanumeric date is found in a file in United States locale format. |
@Date_Any_Numeric (all) |
@Date_Any_Numeric (all) |
To generate a match, the string must contain a valid numeric date in any supported numeric date format. Any of the following separators are supported:
NOTE Use of the "all" locale parameter may result in a large number of false positives. A false positive is a result on an EDD-related report or page in which a match is detected in a file but, upon further investigation, you do not consider the matched content to be at-risk data. If the locale is known, use its applicable parameter instead. |
@Date_Any_Numeric (<locale parameter>) |
@Date_Any_Numeric (en_US) |
To generate a match, the string must contain a valid numeric date in the format supported by the locale parameter. When the example text is used in an expression, a match is detected if a date in one of the following formats is found in a file:
|
@Date_Specific_Numeric (all; YYYY; MM; DD) |
@Date_Specific_Numeric (all; 2019; 01; 15) |
To generate a match, the string must contain the date set in the operator in any supported numeric date format. When the example text is used in an expression, a match is detected if the date January 15, 2019 is found in a file in any supported numeric date format. |
@Date_Specific_Numeric (<locale parameter>; YYYY; MM; DD) |
@Date_Specific_Numeric (en_CA; 2012; 12; 31) |
To generate a match, the string must contain a valid numeric date supported by the locale parameter and matching the date set within the operator. When the example text is used in an expression, a match is detected if the date December 31, 2012 is found in a file in Canadian locale format. |
@mask_after |
@mask_after( 5; (===-==-====)) |
Mask all characters after the <n>th character in a matched "word". This operator is used in many of the templates. When the example text is used in an expression, all characters after the 5th character are masked (replaced with asterisks), such as 785-6******. |
@mask_upto |
@mask_upto( 4; (==== ==== ==== ====)) |
Mask all characters in a matched "word", except the last <n> characters. When the example text is used in an expression, all characters up to the last four characters are masked (replaced with asterisks), such as **** **** **** 1747. |
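The effect of the two masking operators can be sketched in a few lines of Python. The functions below are hypothetical helpers that mimic the documented output, not the product's implementation. Leaving whitespace unmasked is an assumption inferred from the "**** **** **** 1747" example, and the sample numbers are made up.

```python
def _mask(text: str, mask_char: str = "*") -> str:
    """Replace every non-whitespace character with the mask character."""
    # Assumption: spaces between groups stay visible, as in the table's example.
    return "".join(c if c.isspace() else mask_char for c in text)


def mask_after(n: int, text: str) -> str:
    """Keep the first n characters and mask everything after them."""
    return text[:n] + _mask(text[n:])


def mask_upto(n: int, text: str) -> str:
    """Mask everything except the last n characters."""
    cut = max(len(text) - n, 0)
    return _mask(text[:cut]) + text[cut:]


print(mask_after(5, "785-65-4321"))         # 785-6******
print(mask_upto(4, "4556 7375 8689 1747"))  # **** **** **** 1747
```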
NOTE Due to the role of NOT, AND, and OR as operators, you can't use them as "words" in an expression.
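The check digit test behind the @luhn operator can be illustrated with the standard Luhn (mod 10) computation. The Python sketch below is a generic implementation of that public algorithm, not code from the EDD product. The function name luhn_valid is hypothetical, and 4111 1111 1111 1111 is a widely used test number rather than a real card.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn (mod 10) check."""
    digits = [int(c) for c in number if c.isdigit()]
    if not digits:
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the product
        total += digit
    return total % 10 == 0


print(luhn_valid("4111 1111 1111 1111"))  # True
print(luhn_valid("4111 1111 1111 1112"))  # False
```

Because the check looks only at digits, the spaces in the ==== ==== ==== ==== format do not affect the result.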
More expression examples:
Expression |
Definition |
Examples of detected content |
---|---|---|
Asthma |
Detects the word "Asthma" regardless of letter case. |
Asthma, asthma, ASTHMA |
(West Nile W/3 (fever OR virus)) |
Detects the words "west nile" within 3 words of the words "fever" or "virus". |
"fever caused by west nile" |
(Student OR Student ID OR Student Identification Number OR Student Identification) W/1 ???====* |
Detects any of the words [Student, Student ID, Student Identification Number, Student Identification] within one word of an identifier in the format [(3 characters) + (4 digits) + (zero or more additional characters)]. |
Student HGF54793TY8A, Student ID xyz45976 |
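The W/N proximity test used in several of these examples can be approximated as a comparison of word positions. The Python sketch below handles single words only (not multi-word phrases such as "west nile") and assumes that "within N words" means the word positions differ by at most N, which agrees with the "rate of interest" example for W/2. The function name within_n_words is hypothetical.

```python
import re


def within_n_words(text: str, word: str, others: list[str], n: int) -> bool:
    """Return True if `word` occurs within n words of any word in `others`."""
    # Tokenize on the supported word characters and ignore case.
    tokens = re.findall(r"[A-Za-z0-9_-]+", text.lower())
    targets = {o.lower() for o in others}
    word_positions = [i for i, t in enumerate(tokens) if t == word.lower()]
    other_positions = [i for i, t in enumerate(tokens) if t in targets]
    # A distance of 1 means the words are adjacent, before or after.
    return any(
        0 < abs(i - j) <= n for i in word_positions for j in other_positions
    )


# Interest W/2 (Rate OR Income)
print(within_n_words("The rate of interest rose", "Interest", ["Rate", "Income"], 2))
print(within_n_words("Interest on the annual income", "Interest", ["Rate", "Income"], 2))
```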

EDD rule expressions support the following country-specific operators:
Operator |
Example |
Description |
---|---|---|
@jpidnumber (Japan) |
@jpidnumber (====-====-====) |
This operator detects valid Japanese My Numbers. It is used in the My Number (Japan) template. When the example text is used in an expression, a match is detected if a 12 digit number, in the defined format, is a valid Japanese My Number. |
@US_MBI (United States) |
@US_MBI (=??=-??=-??== OR =??=??=??==) |
This operator detects valid US Medicare Beneficiary Identifiers. It is included in the General ID Samples template. When the example text is used in an expression, a match is detected if an 11 character alphanumeric value, in the defined format, is a valid US Medicare Beneficiary Identifier. |
The following operators are used by the predefined GDPR Personal Data rule to find personal identifiers. They are also included in the GDPR Personal Data template. |
@AT_TIN (Austria) |
@AT_TIN (== ======= OR === ======= OR ========== ) |
This operator detects valid Austrian Tax Identification Numbers. When the example text is used in an expression, a match is detected if a 9 or 10 digit number, in the defined format, is a valid Austrian Tax Identification Number. |
@BE_NN (Belgium) |
@BE_NN (== == ==-=== == OR ===========) |
This operator detects valid Belgian National Numbers. When the example text is used in an expression, a match is detected if an 11 digit number, in the defined format, is a valid Belgian National Number. |
@BG_UCN (Bulgaria) |
@BG_UCN (==========) |
This operator detects valid Bulgarian National IDs. When the example text is used in an expression, a match is detected if a 10 digit number, in the defined format, is a valid Bulgarian National ID. |
@CY_TIC (Cyprus) |
@CY_TIC (========?) |
This operator detects valid Cypriot Tax ID Codes. When the example text is used in an expression, a match is detected if a 9 character alphanumeric value, in the defined format, is a valid Cypriot Tax ID Code. |
@CZ_SK_BN (Czech Republic and Slovakia) |
@CZ_SK_BN (====== ==== OR ====== === OR ========== OR =========) |
This operator detects valid Czech or Slovakian Birth Numbers. When the example text is used in an expression, a match is detected if a 9 or 10 digit number, in the defined format, is a valid Czech or Slovakian Birth Number. |
@DE_IDNR (Germany) |
@DE_IDNR (=========== OR == === === ===) |
This operator detects valid German Taxpayer IDs. When the example text is used in an expression, a match is detected if an 11 digit number, in the defined format, is a valid German Taxpayer ID. |
@DE_SSN (Germany) |
@DE_SSN (========?=== OR == ====== ? ===) |
This operator detects valid German Social Security Numbers. When the example text is used in an expression, a match is detected if a 12 character alphanumeric value, in the defined format, is a valid German Social Security Number. |
@DK_CPR (Denmark) |
@DK_CPR (======-==== OR ==========) |
This operator detects valid Danish Civil Registration System numbers. When the example text is used in an expression, a match is detected if a 10 digit number, in the defined format, is a valid Danish Civil Registration System number. |
@EE_IK (Estonia) |
@EE_IK (===========) |
This operator detects valid Estonian Personal ID Numbers. When the example text is used in an expression, a match is detected if an 11 digit number, in the defined format, is a valid Estonian Personal ID Number. |
@ES_DNI (Spain) |
@ES_DNI (========-?) |
This operator detects valid Spanish DNI (Documento Nacional de Identidad) numbers. When the example text is used in an expression, a match is detected if a 9 character alphanumeric value, in the defined format, is a valid Spanish Documento Nacional de Identidad. |
@FI_HETU (Finland) |
@FI_HETU (======-===? OR ======A===?) |
This operator detects valid Finnish Personal ID Numbers. When the example text is used in an expression, a match is detected if an 11 character alphanumeric value, in the defined format, is a valid Finnish Personal ID Number. |
@FR_NIR (France) |
@FR_NIR (======?======== OR = == == =? === === ==) |
This operator detects valid French Social Security Numbers. When the example text is used in an expression, a match is detected if a 15 character alphanumeric value, in the defined format, is a valid French Social Security Number. |
@GR_AFM (Greece) |
@GR_AFM (=========) |
This operator detects valid Greek Tax Identifier numbers. When the example text is used in an expression, a match is detected if a 9 digit number, in the defined format, is a valid Greek Tax Identifier number. |
@GR_AMKA (Greece) |
@GR_AMKA (===========) |
This operator detects valid Greek Social Security Numbers. When the example text is used in an expression, a match is detected if an 11 digit number, in the defined format, is a valid Greek Social Security Number. |
@GR_TAUTOTITA (Greece) |
@GR_TAUTOTITA (?-====== OR ? ======) |
This operator detects valid Greek Personal Identification Numbers. When the example text is used in an expression, a match is detected if a 7 character alphanumeric value, in the defined format, is a valid Greek Personal Identification Number. |
@HR_MBG (Croatia) |
@HR_MBG (=============) |
This operator detects valid Croatian Matični broj građana (MBG). When the example text is used in an expression, a match is detected if a 13 digit number, in the defined format, is a valid Croatian MBG. |
@HR_OIB (Croatia) |
@HR_OIB (===========) |
This operator detects valid Croatian Osobni identifikacijski broj (OIB). When the example text is used in an expression, a match is detected if an 11 digit number, in the defined format, is a valid Croatian OIB. |
@HU_PID (Hungary) |
@HU_PID (======??) |
This operator detects valid Hungarian Personal IDs. When the example text is used in an expression, a match is detected if an 8 character alphanumeric value, in the defined format, is a valid Hungarian Personal ID. |
@HU_PIN (Hungary) |
@HU_PIN (===========) |
This operator detects valid Hungarian Personal Identification Numbers. When the example text is used in an expression, a match is detected if an 11 digit number, in the defined format, is a valid Hungarian Personal Identification Number. |
@HU_TAJ (Hungary) |
@HU_TAJ (=== === === OR ===-===-=== OR =========) |
This operator detects valid Hungarian Social Security Identification Numbers (TAJ). When the example text is used in an expression, a match is detected if a 9 digit number, in the defined format, is a valid Hungarian Social Security Identification Number. |
@IE_PIN (Ireland) |
@IE_PIN (=======? OR =======??) |
This operator detects valid Irish Public Service Numbers. When the example text is used in an expression, a match is detected if an 8 or 9 character alphanumeric value, in the defined format, is a valid Irish Public Service Number. |
@IS_KT (Iceland) |
@IS_KT (========== OR ======-====) |
This operator detects valid Icelandic Kennitala. When the example text is used in an expression, a match is detected if a 10 digit number, in the defined format, is a valid Icelandic Kennitala. |
@IT_CF (Italy) |
@IT_CF (??????==?==?===?) |
This operator detects valid Italian Tax Code IDs. When the example text is used in an expression, a match is detected if a 16 character alphanumeric value, in the defined format, is a valid Italian Tax Code ID. |
@LI_PIN (Liechtenstein) |
@LI_PIN (============) |
This operator detects valid Liechtensteiner Identity Cards. When the example text is used in an expression, a match is detected if a 12 digit number, in the defined format, is a valid Liechtensteiner Identity Card. |
@LT_AK (Lithuania) |
@LT_AK (===========) |
This operator detects valid Lithuanian Personal Codes (Asmens Kodas). When the example text is used in an expression, a match is detected if an 11 digit number, in the defined format, is a valid Lithuanian Personal Code. |
@LU_TIN (Luxembourg) |
@LU_TIN (==== == == === == OR =============) |
This operator detects valid Luxembourgish Personal ID Numbers. When the example text is used in an expression, a match is detected if a 13 digit number, in the defined format, is a valid Luxembourgish Personal ID Number. |
@LV_PK (Latvia) |
@LV_PK (======-=====) |
This operator detects valid Latvian Code numbers. When the example text is used in an expression, a match is detected if an 11 digit number, in the defined format, is a valid Latvian Code number. |
@MT_KTI (Malta) |
@MT_KTI (=======?) |
This operator detects valid Maltese Identity Card Numbers. When the example text is used in an expression, a match is detected if an 8 character alphanumeric value, in the defined format, is a valid Maltese Identity Card Number. |
@NL_BSN (Netherlands) |
@NL_BSN (========= OR ==== == ===) |
This operator detects valid Dutch Citizen Service Numbers. When the example text is used in an expression, a match is detected if a 9 digit number, in the defined format, is a valid Dutch Citizen Service Number. |
@NO_FNR (Norway) |
@NO_FNR (===========) |
This operator detects valid Norwegian Personal ID Numbers. When the example text is used in an expression, a match is detected if an 11 digit number, in the defined format, is a valid Norwegian Personal ID Number. |
@PL_PESEL (Poland) |
@PL_PESEL (===========) |
This operator detects valid Polish Personal Numbers. When the example text is used in an expression, a match is detected if an 11 digit number, in the defined format, is a valid Polish Personal Number. |
@PT_NIC (Portugal) |
@PT_NIC (=========??= OR ========-=??= OR ======== =??=) |
This operator detects valid Portuguese Civil Identification Numbers. When the example text is used in an expression, a match is detected if a 12 character alphanumeric value, in the defined format, is a valid Portuguese Civil Identification Number. |
@PT_NIF (Portugal) |
@PT_NIF (=== === === OR ===-===-=== OR =========) |
This operator detects valid Portuguese Tax Identification Numbers. When the example text is used in an expression, a match is detected if a 9 digit number, in the defined format, is a valid Portuguese Tax Identification Number. |
@PT_NISS (Portugal) |
@PT_NISS (===========) |
This operator detects valid Portuguese Social Security Numbers. When the example text is used in an expression, a match is detected if an 11 digit number, in the defined format, is a valid Portuguese Social Security Number. |
@RO_CNP (Romania) |
@RO_CNP (=============) |
This operator detects valid Romanian Personal Numerical Codes. When the example text is used in an expression, a match is detected if a 13 digit number, in the defined format, is a valid Romanian Personal Numerical Code. |
@SE_PIN (Sweden) |
@SE_PIN (======-==== OR ======== ==== OR ============) |
This operator detects valid Swedish Personal ID Numbers. When the example text is used in an expression, a match is detected if a 10 or 12 digit number, in the defined format, is a valid Swedish Personal ID Number. |
@SI_EMSO (Slovenia) |
@SI_EMSO (=============) |
This operator detects valid Slovenian Master Citizen Numbers. When the example text is used in an expression, a match is detected if a 13 digit number, in the defined format, is a valid Slovenian Master Citizen Number. |
@SK_COP (Slovakia) |
@SK_COP (??====== OR ?? ======) |
This operator detects valid Slovakian Citizen ID Numbers. When the example text is used in an expression, a match is detected if an 8 character alphanumeric value, in the defined format, is a valid Slovakian Citizen ID Number. |
@UK_NHS (United Kingdom) |
@UK_NHS (=== === ====) |
This operator detects valid UK National Health Service numbers. When the example text is used in an expression, a match is detected if a 10 digit number, in the defined format, is a valid National Health Service number. |
@UK_NINO (United Kingdom) |
@UK_NINO (??======?) |
This operator detects valid UK National Insurance Numbers. When the example text is used in an expression, a match is detected if a 9 character alphanumeric value, in the defined format, is a valid UK National Insurance Number. |
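Each country-specific operator pairs a format pattern with a validity check on the matched value. This page does not describe how EDD performs those checks. As one illustration, the Python sketch below applies the publicly documented modulus 11 check digit rule for UK NHS numbers. The function name nhs_number_valid is hypothetical, and 943 476 5919 is a sample value chosen because it satisfies the check, not a real patient number.

```python
def nhs_number_valid(value: str) -> bool:
    """Return True if `value` holds 10 digits with a valid mod 11 check digit."""
    digits = [int(c) for c in value if c.isdigit()]
    if len(digits) != 10:
        return False
    # Weight the first nine digits 10, 9, ..., 2 and sum the products.
    total = sum(d * w for d, w in zip(digits[:9], range(10, 1, -1)))
    check = 11 - (total % 11)
    if check == 11:
        check = 0
    if check == 10:
        return False  # no valid number produces a check value of 10
    return check == digits[9]


print(nhs_number_valid("943 476 5919"))  # True
print(nhs_number_valid("943 476 5918"))  # False
```

This is why a string that merely fits === === ==== is not enough to generate a match: the final digit must also agree with the first nine.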

The following guidelines apply to expressions and expression sets:
- Expressions are not case sensitive.
For example, the expression "confidential w/1 Data" yields the same results as the expression "CONFIDENTIAL W/1 data".
- Expressions can contain any number of nodes. Also see Node limitations.
- Expressions can include the following special characters:
- Dash (-)
- Underscore (_)
- Supported special characters
You will not receive a syntax error if you use an unsupported special character, but file content is not matched as expected. Therefore, to successfully match file content that contains an unsupported special character (for example, "A&W"), replace the special character with a space (for example, "A W"). When you view Matched Tokens after an EDD scan is run, a match to "A&W" shows as "A,W".
- An expression set can contain an unlimited number of expressions, but it must contain at least one expression. If you delete all expressions from an expression set, delete the expression set.
- The same string element (word) can't be used more than 64 times in an expression set.
- A word is limited to 255 bytes in length (some UTF-8 characters are 4 bytes in length).
- The wildcard character (*) can't be used more than once in a word.
- Node limitations:
- A node is defined as a text string that is separated from the rest of the expression by an operator. If the expression contains a phrase with no operators, then the phrase is a node.
- A node is limited to 512 bytes in length.
- A node can't include more than eight (8) words. This includes words that use any of the following special characters: ?, =, *.
If your rule fails to adhere to any of the limitations listed above, you need to correct each error before you can publish the rule.
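The word and node limits above can be checked before you enter an expression set in the console. The Python sketch below is a hypothetical pre-check, not a product tool. It assumes nodes are separated only by parentheses, AND, OR, NOT, and W/N, and it does not handle the @ operators. The names split_nodes and check_expression_set are illustrative.

```python
import re
from collections import Counter

# Simplifying assumption: these tokens are the only node separators.
OPERATOR_RE = re.compile(r"\(|\)|\b(?:AND|OR|NOT|W/\d+)\b", re.IGNORECASE)


def split_nodes(expression: str) -> list[str]:
    """Split an expression into nodes (text between operators)."""
    return [p.strip() for p in OPERATOR_RE.split(expression) if p.strip()]


def check_expression_set(expressions: list[str]) -> list[str]:
    """Return a list of limit violations found in the expression set."""
    problems = []
    word_counts = Counter()
    for expression in expressions:
        for node in split_nodes(expression):
            words = node.split()
            if len(node.encode("utf-8")) > 512:
                problems.append(f"node exceeds 512 bytes: {node!r}")
            if len(words) > 8:
                problems.append(f"node has {len(words)} words (limit 8): {node!r}")
            for word in words:
                word_counts[word.lower()] += 1
                if len(word.encode("utf-8")) > 255:
                    problems.append(f"word exceeds 255 bytes: {word!r}")
                if word.count("*") > 1:
                    problems.append(f"word uses more than one wildcard (*): {word!r}")
    for word, count in word_counts.items():
        if count > 64:
            problems.append(f"word appears {count} times in the set (limit 64): {word!r}")
    return problems


print(check_expression_set(["Interest W/2 (Rate OR Income)", "in*te*"]))
```

Lengths are measured on the UTF-8 encoding because the limits are stated in bytes, not characters.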
For more help creating expressions using the syntax supported in Endpoint Data Discovery Rules, contact Absolute Technical Support.