Handling strings with pystring

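    String handling in C++ has always been a nuisance. std::string is decent, but compared with Python or Lua it is still a lot more work.

Today I came across pystring, a small wrapper library, and it is really quite good. It wraps std::string to provide an interface modeled on Python's string methods.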

 

The project is here: https://code.google.com/p/pystring/  Usage is simple. Below is its documentation, with a short note of mine on each function:

capitalize

std::string capitalize(const std::string & str)

Return a copy of the string with only its first character capitalized.

Capitalizes the first letter.

center

std::string center(const std::string & str, int width)

Return str centered in a string of length width. Padding is done using spaces.

Pads a string with spaces on both sides so that the content of str is centered. For example:

center("yes", 10);  =>  "   yes    "
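When the padding cannot be split evenly, one side gets the extra space. Below is a minimal standalone sketch of the rounding rule CPython uses for str.center; I am assuming pystring follows the same rule, and center_sketch is a made-up name, not part of the library.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not pystring's code): centers s in a field of `width`
// characters using the rounding rule from CPython's str.center.
std::string center_sketch(const std::string& s, int width) {
    const int len = static_cast<int>(s.size());
    if (width <= len) return s;                    // nothing to pad
    const int pad = width - len;                   // total spaces to add
    const int left = pad / 2 + (pad & width & 1);  // extra space goes left only if pad and width are both odd
    return std::string(left, ' ') + s + std::string(pad - left, ' ');
}
```

With this rule center_sketch("yes", 10) puts 3 spaces on the left and 4 on the right, for 10 characters in total.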

count

int count( const std::string & str, const std::string & substr, int start = 0, int end = MAX_32BIT_INT)

Return the number of occurrences of substring sub in string s[start:end]. Optional arguments start and end are interpreted as in slice notation.

Counts how many times the substring occurs.

endswith

bool endswith( const std::string & str, const std::string & suffix, int start = 0, int end = MAX_32BIT_INT)

Return True if the string ends with the specified suffix, otherwise return False. With optional start, test beginning at that position. With optional end, stop comparing at that position.

Checks whether the string ends with the given suffix.

expandtabs

std::string expandtabs( const std::string & str, int tabsize = 8)

Return a copy of the string where all tab characters are expanded using spaces. If tabsize is not given, a tab size of 8 characters is assumed.

Turns tabs into spaces. I like this one, I hate tabs :)
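Note that expanding a tab is not the same as replacing it with a fixed number of spaces: in Python, each tab advances to the next multiple of tabsize. A rough standalone sketch of that column-aware behavior follows, assuming pystring mirrors Python here; expandtabs_sketch is a hypothetical name, not a pystring function.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not pystring's code): column-aware tab expansion.
std::string expandtabs_sketch(const std::string& s, int tabsize = 8) {
    std::string out;
    int column = 0;  // column on the current output line
    for (std::string::size_type i = 0; i < s.size(); ++i) {
        const char c = s[i];
        if (c == '\t') {
            if (tabsize > 0) {
                const int pad = tabsize - (column % tabsize);  // distance to the next tab stop
                out.append(pad, ' ');
                column += pad;
            }
        } else {
            out.push_back(c);
            // A line break resets the column counter.
            column = (c == '\n' || c == '\r') ? 0 : column + 1;
        }
    }
    return out;
}
```

So with tabsize 4, "a\tb" becomes "a" plus three spaces plus "b", not "a" plus four spaces.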

find

int find( const std::string & str, const std::string & sub, int start = 0, int end = MAX_32BIT_INT )

Return the lowest index in the string where substring sub is found, such that sub is contained in the range [start, end). Optional arguments start and end are interpreted as in slice notation. Return -1 if sub is not found.

Finds a substring.

index

int index( const std::string & str, const std::string & sub, int start = 0, int end = MAX_32BIT_INT )

Synonym of find right now. Python version throws exceptions. This one currently doesn't.

Same as find.

isalnum

bool isalnum( const std::string & str )

Return true if all characters in the string are alphanumeric and there is at least one character, false otherwise.

Checks whether every character is a letter or a digit.

isalpha

bool isalpha( const std::string & str )

Return true if all characters in the string are alphabetic and there is at least one character, false otherwise.

Checks whether every character is a letter.

isdigit

bool isdigit( const std::string & str )

Return true if all characters in the string are digits and there is at least one character, false otherwise.

Checks whether every character is a digit.

islower

bool islower( const std::string & str )

Return true if all cased characters in the string are lowercase and there is at least one cased character, false otherwise.

isspace

bool isspace( const std::string & str )

Return true if there are only whitespace characters in the string and there is at least one character, false otherwise.

Checks whether the string is whitespace only. Not sure what that is good for.

istitle

bool istitle( const std::string & str )

Return true if the string is a titlecased string and there is at least one character, i.e. uppercase characters may only follow uncased characters and lowercase characters only cased ones. Return false otherwise.

Checks whether the string is in title format. For example, "You And Me" returns true and "You and Me" returns false.
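The rule in the documentation is easier to follow as code. Here is a minimal standalone sketch of it, written from the description above rather than taken from pystring's source; istitle_sketch is a hypothetical name and it only handles single-byte characters via <cctype>.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper (not pystring's code): uppercase may only follow an
// uncased character, lowercase may only follow a cased one.
bool istitle_sketch(const std::string& s) {
    bool seen_cased = false;  // at least one letter was seen
    bool prev_cased = false;  // the previous character was a letter
    for (std::string::size_type i = 0; i < s.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        if (std::isupper(c)) {
            if (prev_cased) return false;   // uppercase in the middle of a word
            prev_cased = true;
            seen_cased = true;
        } else if (std::islower(c)) {
            if (!prev_cased) return false;  // word starts with lowercase
            prev_cased = true;
            seen_cased = true;
        } else {
            prev_cased = false;             // digits, spaces, punctuation
        }
    }
    return seen_cased;
}
```

"You and Me" fails because the word "and" starts with a lowercase letter right after a space.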

isupper

bool isupper( const std::string & str )

Return true if all cased characters in the string are uppercase and there is at least one cased character, false otherwise.

Checks whether all letters are uppercase.

join

std::string join( const std::string & str, const std::vector< std::string > & seq )

Return a string which is the concatenation of the strings in the sequence seq. The separator between elements is the str argument.

Joins strings together. This one is nice. An example:

string str = "abandon and you and me";
string str1 = "you";
string str2 = "and";
string str3 = "me";
vector<string> vec;
vec.push_back(str1);
vec.push_back(str2);
vec.push_back(str3);

string strJoined = pystring::join(";", vec);   // yields you;and;me

ljust

std::string ljust( const std::string & str, int width )

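Return the string left justified in a string of length width. Padding is done using spaces. The original string is returned if width is less than str.size().

This is justification: left-justified here, padded on the right with spaces. Note that only spaces can be used for padding; unlike Python, you cannot supply a fillchar.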
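If you need a fill character other than a space, a small helper of your own is enough. The sketch below is my own illustration and not part of pystring; ljust_fill and its fill parameter are made-up names.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not a pystring function): left-justify s in a field of
// `width` characters, padding on the right with `fill`.
std::string ljust_fill(const std::string& s, int width, char fill = ' ') {
    if (width <= static_cast<int>(s.size())) return s;  // already wide enough
    return s + std::string(width - s.size(), fill);
}
```

For instance, ljust_fill("ab", 5, '*') gives "ab***", while a string longer than width comes back unchanged.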

lower

std::string lower( const std::string & str )

Return a copy of the string converted to lowercase.

Converts everything to lowercase. Note that none of these conversion functions modify the original string.

lstrip

std::string lstrip( const std::string & str, const std::string & chars = "" )

Return a copy of the string with leading characters removed. If chars is omitted or None, whitespace characters are removed. If given and not "", chars must be a string; the characters in the string will be stripped from the beginning of the string this method is called on (argument "str" ).

Strips content from the left side. If chars is not given, whitespace is removed. A very handy function.

partition

void partition( const std::string & str, const std::string & sep, std::vector< std::string > & result )

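Split the string around the first occurrence of sep. Three strings will always be placed into result. If sep is found, the strings will be the text before sep, sep itself, and the remaining text. If sep is not found, the original string will be returned with two empty strings.

A splitting function. Using the sep substring you specify as the delimiter, it splits the string into three parts: the text before sep, sep itself, and the text after sep. If sep is not found, or sep is empty, result holds the original string and two empty strings. An example: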

string str = "youandme";
vector<string> result;
pystring::partition(str, string("and"), result);
vector<string>::iterator it = result.begin();
for (;it!=result.end();it++){
    cout<<*it<<endl;
}

Output:

you

and

me
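To make the "always three strings" rule concrete, here is a standalone sketch of the same logic using only std::string. It follows the description above (an empty sep is treated as not found); the library's actual behavior for an empty separator may differ between versions, so treat that part as an assumption. partition_sketch is a hypothetical name.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not pystring's code): always returns exactly 3 strings.
std::vector<std::string> partition_sketch(const std::string& s, const std::string& sep) {
    std::vector<std::string> result(3);  // three empty strings to start with
    // Assumption: an empty separator counts as "not found".
    const std::string::size_type pos = sep.empty() ? std::string::npos : s.find(sep);
    if (pos == std::string::npos) {
        result[0] = s;                   // original string plus two empty strings
        return result;
    }
    result[0] = s.substr(0, pos);            // text before sep
    result[1] = sep;                         // sep itself
    result[2] = s.substr(pos + sep.size());  // text after sep
    return result;
}
```

Because the result always has three elements, the caller can index result[0] to result[2] without checking the size first.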

 

replace

std::string replace( const std::string & str, const std::string & oldstr, const std::string & newstr, int count = -1)

Return a copy of the string with all occurrences of substring old replaced by new. If the optional argument count is given, only the first count occurrences are replaced.

Replacement; not much to say. An example:

string str = "中国人美国人火星人";
str = pystring::replace(str, "人", "男人");

rfind

int rfind( const std::string & str, const std::string & sub, int start = 0, int end = MAX_32BIT_INT )

Return the highest index in the string where substring sub is found, such that sub is contained within s[start:end]. Optional arguments start and end are interpreted as in slice notation. Return -1 on failure.

Same as find, but searching from the right.

rindex

int rindex( const std::string & str, const std::string & sub, int start = 0, int end = MAX_32BIT_INT )

Currently a synonym of rfind. The python version raises exceptions. This one currently does not.

Same as index, but searching from the right.

rjust

std::string rjust( const std::string & str, int width)

Return the string right justified in a string of length width. Padding is done using spaces. The original string is returned if width is less than str.size().

Same as ljust, but right-justified.

rpartition

void rpartition( const std::string & str, const std::string & sep, std::vector< std::string > & result )

Split the string around the last occurrence of sep. Three strings will always be placed into result. If sep is found, the strings will be the text before sep, sep itself, and the remaining text. If sep is not found, the original string will be returned with two empty strings.

Same as partition, but searching for sep from the right.

rsplit

void rsplit( const std::string & str, std::vector< std::string > & result, const std::string & sep = "", int maxsplit = -1)

Fills the "result" list with the words in the string, using sep as the delimiter string. Does a number of splits starting at the end of the string, the result still has the split strings in their original order. If maxsplit is > -1, at most maxsplit splits are done. If sep is "", any whitespace string is a separator.

Good stuff: splits str by sep, using whitespace by default. Note that although it is called rsplit, the pieces keep their original order.
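The "r" only matters once maxsplit is set: the splits are counted from the right end, so the unsplit remainder ends up at the front. Here is a rough standalone sketch for the explicit-separator case, written from the description above and not taken from pystring's source; rsplit_sketch is a hypothetical name and the whitespace mode (empty sep) is deliberately left out.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not pystring's code): split from the right on a
// non-empty separator, keeping the pieces in their original order.
std::vector<std::string> rsplit_sketch(const std::string& s, const std::string& sep,
                                       int maxsplit = -1) {
    std::vector<std::string> parts;
    if (sep.empty()) {  // whitespace mode is not covered by this sketch
        parts.push_back(s);
        return parts;
    }
    std::string::size_type end = s.size();  // right edge of the unsplit part
    int splits = 0;
    while ((maxsplit < 0 || splits < maxsplit) && end >= sep.size()) {
        // Last occurrence of sep that lies entirely before `end`.
        const std::string::size_type pos = s.rfind(sep, end - sep.size());
        if (pos == std::string::npos) break;
        parts.push_back(s.substr(pos + sep.size(), end - pos - sep.size()));
        end = pos;
        ++splits;
    }
    parts.push_back(s.substr(0, end));         // whatever is left on the far left
    std::reverse(parts.begin(), parts.end());  // restore original order
    return parts;
}
```

With maxsplit = 1, "a,b,c" becomes "a,b" and "c", whereas splitting from the left with the same limit would give "a" and "b,c".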

rstrip

std::string rstrip( const std::string & str, const std::string & chars = "" );

Return a copy of the string with trailing characters removed. If chars is "", whitespace characters are removed. If not "", the characters in the string will be stripped from the end of the string this method is called on.

Same as strip, but only from the right.

slice

std::string slice( const std::string & str, int start = 0, int end = MAX_32BIT_INT)

Function matching python's slice functionality.

Slicing; basically the same as substr.
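The main differences from substr are that the second argument is an end index rather than a length, and that negative indices count from the end as in Python. A minimal standalone sketch of those semantics, written under that assumption; slice_sketch is a hypothetical name and INT_MAX stands in for MAX_32BIT_INT.

```cpp
#include <cassert>
#include <climits>
#include <string>

// Hypothetical helper (not pystring's code): Python-style s[start:end].
std::string slice_sketch(const std::string& s, int start = 0, int end = INT_MAX) {
    const int len = static_cast<int>(s.size());
    if (start < 0) {          // negative start counts from the end
        start += len;
        if (start < 0) start = 0;
    }
    if (end < 0) {            // negative end counts from the end
        end += len;
        if (end < 0) end = 0;
    }
    if (end > len) end = len; // clamp instead of throwing
    if (start >= end) return std::string();
    return s.substr(start, end - start);
}
```

Out-of-range indices are clamped here instead of throwing std::out_of_range, which is what makes slicing more forgiving than calling substr directly.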

split

void split( const std::string & str, std::vector< std::string > & result, const std::string & sep = "", int maxsplit = -1)

Fills the "result" list with the words in the string, using sep as the delimiter string. If maxsplit is > -1, at most maxsplit splits are done. If sep is "", any whitespace string is a separator.

Same as rsplit, but splitting from the left.

splitlines

void splitlines( const std::string & str, std::vector< std::string > & result, bool keepends = false )

Return a list of the lines in the string, breaking at line boundaries. Line breaks are not included in the resulting list unless keepends is given and true.

Splits str into lines. Pay attention to the line separators. An example:

string str = "you and me\n\r we are good friend\n\r No\n\r You are fuck\n\r";
vector<string> vec;
pystring::splitlines(str, vec, true);

vector<string>::iterator it = vec.begin();
for (;it!=vec.end();it++) {
    cout<<*it;
}

startswith

bool startswith( const std::string & str, const std::string & prefix, int start = 0, int end = MAX_32BIT_INT )

Return True if string starts with the prefix, otherwise return False. With optional start, test string beginning at that position. With optional end, stop comparing string at that position.

The counterpart of endswith.

strip

std::string strip( const std::string & str, const std::string & chars = "" )

Return a copy of the string with leading and trailing characters removed. If chars is "", whitespace characters are removed. If not "", the characters in the string will be stripped from both ends of the string this method is called on.

The companion of rstrip.

swapcase

std::string swapcase( const std::string & str )

Return a copy of the string with uppercase characters converted to lowercase and vice versa.

Swaps upper and lower case. Not sure what it is useful for.

title

std::string title( const std::string & str )

Return a titlecased version of the string: words start with uppercase characters, all remaining cased characters are lowercase.

Converts the string to title format; see istitle.

translate

std::string translate( const std::string & str, const std::string & table, const std::string & deletechars = "")

Return a copy of the string where all characters occurring in the optional argument deletechars are removed, and the remaining characters have been mapped through the given translation table, which must be a string of length 256.

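The author clearly did not put much care into this function: there is no maketrans function, and it also has a bug.

I have already reported the bug. For now, patch it yourself and keep going. I also wrote a maketrans function, see below: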

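// Builds a 256-byte translation table that maps from[i] to to[i].
// Returns an empty string if from and to differ in length.
std::string maketrans(const std::string & from, const std::string & to)
{
    std::string::size_type len = from.size();
    if (len != to.size())
    {
        // Constructing a std::string from NULL is undefined behavior.
        return std::string();
    }

    // Start from the identity mapping.
    std::string trans_table(256, '\0');
    for (int i=0; i<256; i++)
    {
        trans_table[i] = static_cast<char>(i);
    }

    for (std::string::size_type j=0; j<len; j++)
    {
        // Index through unsigned char so bytes >= 0x80 do not go negative.
        trans_table[static_cast<unsigned char>(from[j])] = to[j];
    }

    return trans_table;
}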

 

The bug fix:

in line 765
  s += table[ s[i] ];
should be:
  s += table[ str[i] ];

A usage example:

string str = "you and me";
string from = "abcd";
string to = "ABCD";
string table = pystring::maketrans(from, to);
string aa = pystring::translate(str, table, "m");
cout<<aa;

Output: you AnD e

upper

std::string upper( const std::string & str )

Return a copy of the string converted to uppercase.

Converts everything to uppercase.

zfill

std::string zfill( const std::string & str, int width )

Return the numeric string left filled with zeros in a string of length width. The original string is returned if width is less than str.size().

Pads with zeros on the left.
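One detail worth knowing: Python's zfill keeps a leading sign in front of the zeros. The standalone sketch below follows Python's rule, on the assumption that pystring mirrors it; zfill_sketch is a hypothetical name, not the library's code.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not pystring's code): left-fill with zeros, keeping a
// leading '+' or '-' in front of the padding.
std::string zfill_sketch(const std::string& s, int width) {
    const int len = static_cast<int>(s.size());
    if (len >= width) return s;  // already wide enough
    const std::string zeros(width - len, '0');
    if (!s.empty() && (s[0] == '+' || s[0] == '-')) {
        return s[0] + zeros + s.substr(1);  // sign, then zeros, then digits
    }
    return zeros + s;
}
```

So "-42" padded to width 5 becomes "-0042" rather than "00-42".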

Original article: https://www.cnblogs.com/ulihj/archive/2010/12/22/1913744.html
