StringTools - Maple Programming Help


StringTools[CharacterFrequencies]

compute the number of occurrences of each character in a string


Calling Sequence

CharacterFrequencies( s, filter )

Parameters

s      - Maple string

filter - (optional) name or string; character class filter specifying which characters' frequencies are returned

Description

• The CharacterFrequencies( s ) command returns an expression sequence of equations of the form character = frequency, where character is a single-character string and frequency is the number of times that character occurs in the string s.

• The expression CharacterFrequencies( s ) is equivalent to seq( ch = CountCharacterOccurrences( s, ch ), ch = Support( s ) ), but is computed more efficiently.

• The equations appear in ASCII order; that is, in order of the numeric byte value of the character on the left-hand side of each equation. The Examples section shows how to sort by frequency instead.

• To return the frequencies of only certain characters, pass the optional character class filter parameter. The filter can be a string listing the characters of interest, for example "abcd", or one of the following character class names.

alpha     alphabetic characters
alnum     alphabetic characters and digits
ascii     ASCII (7-bit) characters
binary    "0" and "1"
cntrl     control characters
digit     decimal digits
dna       A, C, G, or T
hdigit    hexadecimal digits (both cases)
ident     identifier characters
ident1    leading identifier characters
lower     lowercase letters
odigit    octal digits (0-7)
space     whitespace characters
upper     uppercase letters
vowel     vowels (both cases)

• All of the StringTools package commands treat strings as (null-terminated) sequences of 8-bit (ASCII) characters. Thus, there is no support for multibyte character encodings, such as Unicode encodings.
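
The equivalence between CharacterFrequencies( s ) and the seq expression given in the Description can be checked with a short sketch. The sample string and the names s, direct, and viaSeq are illustrative choices, not part of the original help page; the final comparison is expected to return true only insofar as the stated equivalence holds for the chosen input.

```maple
with(StringTools):

# Sample input (an arbitrary illustrative string)
s := "abracadabra":

# Frequencies computed directly, wrapped in a list for comparison
direct := [CharacterFrequencies(s)]:

# The same equations assembled one character at a time, as in the Description
viaSeq := [seq(ch = CountCharacterOccurrences(s, ch), ch = Support(s))]:

# Compare the two lists of equations
evalb(direct = viaSeq);
```

Wrapping each expression sequence in a list makes the comparison a single equality test rather than a comparison of two sequences.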

Examples

with(StringTools):

CharacterFrequencies("")

CharacterFrequencies("aaaa")

"a" = 4

(1)

CharacterFrequencies("abcadaeb")

"a" = 3, "b" = 2, "c" = 1, "d" = 1, "e" = 1

(2)

CharacterFrequencies("abracadabra")

"a" = 5, "b" = 2, "c" = 1, "d" = 1, "r" = 2

(3)

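CharacterFrequencies(Random(1000000, 'lower'))

"a" = 38715, "b" = 38351, "c" = 38615, "d" = 38449, "e" = 38194, "f" = 38657, "g" = 38420, "h" = 38624, "i" = 37981, "j" = 38519, "k" = 38495, "l" = 38382, "m" = 38407, "n" = 38522, "o" = 38175, "p" = 38798, "q" = 38050, "r" = 38676, "s" = 38658, "t" = 38402, "u" = 38559, "v" = 38293, "w" = 38421, "x" = 38494, "y" = 38644, "z" = 38499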

(4)

CharacterFrequencies(Random(1000000, 'dna'))

"A" = 251057, "C" = 249668, "G" = 249658, "T" = 249617

(5)

CharacterFrequencies(Random(1000000, 'binary'))

"0" = 499805, "1" = 500195

(6)

Shakespeare := "When in disgrace with Fortune and men's eyes,\nI all alone beweep my outcast state,\nAnd trouble deaf heaven with my bootless cries,\nAnd look upon my self and curse my fate,\nWishing me like to one more rich in hope,\nFeatured like him, like him with friends possessed,\nDesiring this man's art, and that man's scope,\nWith what I most enjoy contented least,\nYet in these thoughts my self almost despising,\nHaply I think on thee, and then my state,\n(Like to the lark at break of day arising\nFrom sullen earth) sings hymns at heaven's gate,\n For thy sweet love remembered such wealth brings,\n That then I scorn to change my state with kings.":

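cf := CharacterFrequencies(Shakespeare)

cf := "\n" = 13, " " = 106, "'" = 4, "(" = 1, ")" = 1, "," = 15, "." = 1, "A" = 2, "D" = 1, "F" = 4, "H" = 1, "I" = 4, "L" = 1, "T" = 1, "W" = 3, "Y" = 1, "a" = 35, "b" = 6, "c" = 10, "d" = 15, "e" = 66, "f" = 6, "g" = 11, "h" = 31, "i" = 31, "j" = 1, "k" = 9, "l" = 19, "m" = 20, "n" = 38, "o" = 28, "p" = 7, "r" = 21, "s" = 42, "t" = 48, "u" = 9, "v" = 3, "w" = 8, "y" = 13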

(7)

We can sort the results by frequency, as follows.

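sort([cf], (u, v) -> evalb(rhs(u) < rhs(v)))

["(" = 1, ")" = 1, "." = 1, "D" = 1, "H" = 1, "L" = 1, "T" = 1, "Y" = 1, "j" = 1, "A" = 2, "W" = 3, "v" = 3, "'" = 4, "F" = 4, "I" = 4, "b" = 6, "f" = 6, "p" = 7, "w" = 8, "k" = 9, "u" = 9, "c" = 10, "g" = 11, "\n" = 13, "y" = 13, "," = 15, "d" = 15, "l" = 19, "m" = 20, "r" = 21, "o" = 28, "h" = 31, "i" = 31, "a" = 35, "n" = 38, "s" = 42, "t" = 48, "e" = 66, " " = 106]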

(8)
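
The same approach can be adapted to list the most frequent characters first, or to look up the count for a single character. The following is a minimal sketch under stated assumptions: the names freqs, byCount, and T are hypothetical, and the short input string is chosen only to keep the block self-contained.

```maple
with(StringTools):

# Illustrative input; any string would do
freqs := [CharacterFrequencies("abracadabra")]:

# Sort in descending order of count by comparing right-hand sides with >
byCount := sort(freqs, (u, v) -> evalb(rhs(u) > rhs(v)));

# The first entry pairs a most frequent character with its count
byCount[1];

# Store the equations in a table to look up a count by character
T := table(freqs):
T["a"];
```

Given the counts shown in example (3), the first entry of byCount and the lookup T["a"] should both report a count of 5 for "a".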

Use a filter to restrict attention to a limited class of characters.

CharacterFrequencies(Shakespeare, "ABCYZ")

"A" = 2, "Y" = 1

(9)

CharacterFrequencies(Shakespeare, 'upper')

"A" = 2, "D" = 1, "F" = 4, "H" = 1, "I" = 4, "L" = 1, "T" = 1, "W" = 3, "Y" = 1

(10)

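CharacterFrequencies(Random(1000000), 'lower')

"a" = 3972, "b" = 3990, "c" = 3839, "d" = 3885, "e" = 3851, "f" = 3949, "g" = 3921, "h" = 4017, "i" = 3926, "j" = 4025, "k" = 4089, "l" = 3815, "m" = 3881, "n" = 3962, "o" = 3950, "p" = 3801, "q" = 3910, "r" = 3877, "s" = 3789, "t" = 3820, "u" = 3918, "v" = 4004, "w" = 3948, "x" = 3851, "y" = 3855, "z" = 3809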

(11)

See Also

character classes

sort

string

StringTools

StringTools[CountCharacterOccurrences]

StringTools[Random]

with