Testing Metrics for Password Creation Policies by Attacking Large Sets of Revealed Passwords, CCS, 10
Two conclusions: (1) Shannon entropy, as suggested in the NIST Electronic Authentication Guideline SP-800-63, is not the same as guessing entropy and is thus an ineffective metric for password security, and (2) most common password creation policies (minimum length, restrictions on the character alphabet, etc.) remain vulnerable to online attacks.
The authors establish the conclusions by analyzing the success rate of John the Ripper against real user passwords collected from different websites, the largest one containing over 32,000,000 passwords (RockYou password list).
The principal reason Shannon entropy differs from guessing entropy is the skewed distribution of real-world passwords: not every password is equally likely. The authors show this by counterexample. For Shannon entropy to be equivalent to guessing entropy, the proportion of cracked passwords would have to increase linearly with the number of guesses made, but the experiments point to the contrary.
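The gap between the two measures can be illustrated with a toy calculation. The sketch below is not the authors' experiment: the two eight-password distributions are made-up numbers, and guessing entropy is taken in Massey's sense (the expected number of guesses when passwords are tried in decreasing order of probability). Under the uniform distribution the cracked fraction grows linearly with the number of guesses; under the skewed one most of the mass falls in the first few guesses.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy, in bits, of a probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def guessing_entropy(probs):
    """Expected number of guesses when candidates are tried from most
    to least likely (Massey's guessing entropy)."""
    ordered = sorted(probs, reverse=True)
    return sum(rank * p for rank, p in enumerate(ordered, start=1))

def cracked_fraction(probs, guesses):
    """Fraction of accounts cracked after the given number of guesses,
    assuming the attacker guesses in decreasing order of probability."""
    ordered = sorted(probs, reverse=True)
    return sum(ordered[:guesses])

# Hypothetical distributions over 8 passwords (illustrative numbers only).
uniform = [1 / 8] * 8
skewed = [0.5, 0.2, 0.1, 0.05, 0.05, 0.04, 0.03, 0.03]

for name, dist in (("uniform", uniform), ("skewed", skewed)):
    curve = [round(cracked_fraction(dist, k), 3) for k in range(1, 9)]
    print(name, round(shannon_entropy(dist), 3),
          round(guessing_entropy(dist), 3), curve)
```

The printed curve for the uniform case rises in equal steps, while the skewed curve is front-loaded, which is the shape the cracking experiments are said to exhibit.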
The principal reason common password policies remain vulnerable is that a subset of users pick easy-to-guess passwords that still comply with the password creation policy in place.
The feasibility of online attacks depends on the lockout (or rate-limiting) policy, the value of the target, user training, etc. The authors note that there are several kinds of password policies: explicit, implicit, and external. Explicit policies give the user a set of rules upfront, implicit policies accept or reject a password after the user has submitted one, and external policies improve entropy by controlling the password fully (proposing a generated password) or partially (improving the entropy of a base password). One limitation of the experiments is that the accounts for which the passwords were created were not corporate ones; another is that it was unknown which password was created under which policy.
The authors propose an implicit policy in which a cracking algorithm learns how frequently people use certain words, how they mangle case, the basic structure of passwords, the probability of digits and special symbols, etc. Based on the acquired information, the algorithm constructs a probabilistic context-free grammar that models how people create passwords. When a user submits a password of their choosing, the algorithm computes how likely it is that the password came from the grammar and, using a preselected threshold, either accepts or rejects it.
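A much-simplified sketch of such a threshold check follows. It is only in the spirit of the proposal, not the authors' grammar: the class name ToyGrammar, the function accept, the training list, and the threshold are all hypothetical, case mangling is ignored by lowercasing, and letter runs are given learned probabilities like digit and symbol runs, which is a simplifying assumption. A password's probability is the probability of its base structure (for example, 8 letters followed by 1 digit) times the probability of each run given its class and length.

```python
import re
from collections import Counter

# One alternative per character class: letters, digits, everything else.
SEGMENT = re.compile(r"(?P<L>[A-Za-z]+)|(?P<D>[0-9]+)|(?P<S>[^A-Za-z0-9]+)")

def segments(password):
    """Split a password into (class, text) runs, e.g. [('L','abc'),('D','12')]."""
    return [(m.lastgroup, m.group()) for m in SEGMENT.finditer(password)]

class ToyGrammar:
    """Hypothetical, simplified grammar learned from a list of passwords."""

    def __init__(self, training_passwords):
        self.structures = Counter()   # base structure -> count
        self.terminals = {}           # (class, length) -> Counter of runs
        for pw in training_passwords:
            segs = segments(pw)
            self.structures[tuple((c, len(t)) for c, t in segs)] += 1
            for c, t in segs:
                self.terminals.setdefault((c, len(t)), Counter())[t.lower()] += 1
        self.total = sum(self.structures.values())

    def probability(self, password):
        """Estimated probability that the grammar generates this password."""
        if self.total == 0:
            return 0.0
        segs = segments(password)
        p = self.structures[tuple((c, len(t)) for c, t in segs)] / self.total
        for c, t in segs:
            counts = self.terminals.get((c, len(t)))
            if not counts:
                return 0.0  # run shape never seen in training
            p *= counts[t.lower()] / sum(counts.values())
        return p

def accept(grammar, password, threshold):
    """Accept only passwords the grammar considers sufficiently unlikely."""
    return grammar.probability(password) < threshold

# Made-up training data and threshold, for illustration only.
training = ["password1", "password1", "monkey12", "abc123!"]
grammar = ToyGrammar(training)
print(accept(grammar, "password1", 0.1), accept(grammar, "x9$kQ", 0.1))
```

A real deployment would need smoothing for unseen structures and a far larger training set; here anything unseen simply gets probability zero and is accepted.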
Citation (ACM Ref): Matt Weir, Sudhir Aggarwal, Michael Collins, and Henry Stern. 2010. Testing metrics for password creation policies by attacking large sets of revealed passwords. In Proceedings of the 17th ACM conference on Computer and communications security (CCS '10). ACM, New York, NY, USA, 162-175. DOI=10.1145/1866307.1866327 http://doi.acm.org/10.1145/1866307.1866327
Encountering Stronger Password Requirements: User Attitudes and Behaviors, SOUPS, 10
At the start of 2010, CMU's password policy shifted from very loose to strict. Previously the only requirement was at least one character; the new policy requires at least 8 characters, including one uppercase letter, one lowercase letter, one digit, and one symbol. In addition, a password is rejected if the string obtained after removing all non-alphabetic characters matches a dictionary word, or if it contains four or more occurrences of the same character.
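The rules above can be sketched as a checker. This is an illustration of the policy as summarized here, not CMU's actual implementation: the function name violations, the three-word DICTIONARY, the use of string.punctuation as the symbol set, and the case-insensitive dictionary match are all assumptions.

```python
import string
from collections import Counter

# Hypothetical stand-in for the real dictionary, which is not given here.
DICTIONARY = {"password", "dragon", "monkey"}

def violations(password, dictionary=DICTIONARY):
    """Return the list of strict-policy rules that the password breaks."""
    problems = []
    if len(password) < 8:
        problems.append("fewer than 8 characters")
    if not any(c.isupper() for c in password):
        problems.append("no uppercase letter")
    if not any(c.islower() for c in password):
        problems.append("no lowercase letter")
    if not any(c.isdigit() for c in password):
        problems.append("no digit")
    if not any(c in string.punctuation for c in password):
        problems.append("no symbol")
    # Dictionary check on what remains after dropping non-alphabetic
    # characters (lowercasing before lookup is an assumption).
    letters_only = "".join(c for c in password if c.isalpha()).lower()
    if letters_only in dictionary:
        problems.append("letters form a dictionary word")
    if password and max(Counter(password).values()) >= 4:
        problems.append("four or more occurrences of one character")
    return problems

for candidate in ("Tr0ub4dor&3x", "Pass!word1", "Aaaa1!aa"):
    print(candidate, violations(candidate))
```

Note that "Pass!word1" satisfies every character-class rule and fails only the dictionary check, which is the kind of password the extra rule is meant to catch.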
The authors conducted a paper-based survey of 470 CMU computer users by asking passersby on the CMU campus to complete it. The survey included 4 questions on demographics, 8 on password handling, and further questions on password composition, password storage and reuse, and user sentiment. Overall, the users were neutral about the policy change. According to the users' self-reports, it took on average about 1.77 tries to create a password under the new (strict) policy and about 1.25 tries to log in. 19% of the users forgot their new passwords; among them, 60% remembered it afterwards, 21% retrieved it from where they had written it down, and 11% went to the help desk. Faculty and staff were more likely to forget passwords than students, women were more likely than men, and those who changed their password early were less likely; IT experience and age had no significant influence.
Participants were less likely to answer questions whose answers would reduce the entropy of their passwords more, implying that they intuitively knew which aspects of their passwords were more important. The authors conclude that users find the new requirements annoying but believe they provide better security, some users struggle to comply with the new requirements, users are more likely to share and reuse passwords than to write them down, users tend to modify old passwords to create new ones, the likelihood of sharing passwords increases over time, and using dictionary words and names is the most common strategy for creating passwords. Contrary to NIST's assumptions, users create passwords with an average length greater than the minimum, users frequently use special characters, and more than two-thirds of the users use two or more numbers.
Citation (ACM Ref): Richard Shay, Saranga Komanduri, Patrick Gage Kelley, Pedro Giovanni Leon, Michelle L. Mazurek, Lujo Bauer, Nicolas Christin, and Lorrie Faith Cranor. 2010. Encountering stronger password requirements: user attitudes and behaviors. In Proceedings of the Sixth Symposium on Usable Privacy and Security (SOUPS '10). ACM, New York, NY, USA, Article 2, 20 pages. DOI=10.1145/1837110.1837113 http://doi.acm.org/10.1145/1837110.1837113
Of Passwords and People: Measuring the Effect of Password-Composition Policies, CHI, 11
The authors analyze four password-composition policies: basic8 (minimum 8 characters, no restrictions), dic8 (minimum 8 characters, checked against a dictionary), comprehensive8 (minimum 8 characters plus the usual composition rules), and basic16 (minimum 16 characters, no restrictions). More than 5000 people participated in the study via Amazon's Mechanical Turk. Each participant created a password and, two days later, had to log in with it.
Basic16 was a good tradeoff between usability and security: it had higher entropy than comprehensive8 yet was easy to create and remember. Dictionary checks do not increase entropy, but using a comprehensive dictionary can make guessing attacks much harder. However, dictionary checking can be frustrating for the user. How likely a user is to write a password down is correlated with the entropy of the password.
Contrary to common myths, the authors found that adding numbers increases entropy significantly, a dictionary check does not add much entropy, and users' base passwords usually exceed the minimum requirements.
Citation (ACM Ref): Saranga Komanduri, Richard Shay, Patrick Gage Kelley, Michelle L. Mazurek, Lujo Bauer, Nicolas Christin, Lorrie Faith Cranor, and Serge Egelman. 2011. Of passwords and people: measuring the effect of password-composition policies. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '11). ACM, New York, NY, USA, 2595-2604. DOI=10.1145/1978942.1979321 http://doi.acm.org/10.1145/1978942.1979321
PIN selection policies: Are they really effective?, C&S, 12
Anonymously collected PINs from 204,508 users of 'Big Brother Camera Security' reveal that PIN selection follows a power law with alpha = 2.25, meaning that the frequency of a PIN decreases as a power function of its popularity rank. For example, the 100 most popular PINs account for about 29.3% of all PINs, and the 50 PINs representing the years 1951 through 2000 account for 5.5%.
Two measures of PIN closeness were investigated: the sum of the differences between consecutive digits in the PIN, and the sum of the physical distances between consecutive digits in the PIN. An anonymous user study with 332 participants was conducted over one month (there was no longitudinal test of memorability; response collection simply took one month) using 5 policies: 4 digits without restriction, 4 digits excluding the 200 most popular PINs, 4 digits excluding 3205 PINs comprising the 200 most popular and those with physical closeness < 4, 6 digits without restriction, and 6 digits with a closeness restriction. Remembrance difficulty was self-reported. Two measures of security were used: Shannon's entropy and Massey's guessing entropy. The authors found that, in general, the stricter the policy, the more secure the PINs and the greater the remembrance difficulty. The conclusion of the work is that a PIN selection policy, if designed with caution, can be an easy solution to the highly skewed distribution of user-chosen PINs.
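The two closeness measures can be sketched as follows. The summary above does not specify the keypad layout or the distance metric, so this sketch assumes a standard telephone keypad (0 below 8), Euclidean distance between key centers with unit spacing, and absolute differences for the numeric measure; the function names are hypothetical.

```python
import math

# Assumed layout: (row, column) of each key on a standard phone keypad.
KEYPAD = {
    "1": (0, 0), "2": (0, 1), "3": (0, 2),
    "4": (1, 0), "5": (1, 1), "6": (1, 2),
    "7": (2, 0), "8": (2, 1), "9": (2, 2),
    "0": (3, 1),
}

def numeric_closeness(pin):
    """Sum of absolute differences between consecutive digits."""
    return sum(abs(int(a) - int(b)) for a, b in zip(pin, pin[1:]))

def physical_closeness(pin):
    """Sum of Euclidean distances between consecutive keys on the keypad."""
    total = 0.0
    for a, b in zip(pin, pin[1:]):
        (row_a, col_a), (row_b, col_b) = KEYPAD[a], KEYPAD[b]
        total += math.hypot(row_a - row_b, col_a - col_b)
    return total

for pin in ("1111", "1234", "2580", "1973"):
    print(pin, numeric_closeness(pin), round(physical_closeness(pin), 3))
```

Under these assumptions a straight-line PIN such as 2580 is numerically spread out but physically very close, which is why the two measures are worth distinguishing.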
Citation: Hyoungshick Kim, Jun Ho Huh, PIN selection policies: Are they really effective?, Computers & Security, Volume 31, Issue 4, June 2012, Pages 484-496, ISSN 0167-4048, 10.1016/j.cose.2012.02.003.