Generating a Random Chinese Character in Python

Updated: 2023-01-09 10:17:57 Author: 夢想橡皮擦

This article explains in detail how to generate a random Chinese character in Python. The sample code is walked through step by step and should help deepen your understanding of Python; refer to it as needed.

Where the requirement came from

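While building the Crawler Training Ground (爬蟲訓練場) project, I ran into a requirement for random avatars, and decided to generate them from random Chinese characters.

The intended effect: pass in a string of Chinese characters and get an image back.

The API endpoint looks like this: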

https://ui-avatars.com/api/?name=夢想橡皮擦&background=03a9f4&color=ffffff&rounded=true

The parameters are:

name: the text to render;
background: the background color;
color: the foreground color;
rounded: whether the avatar is round.
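As a rough sketch of how such a URL could be assembled in Python, the snippet below uses urllib.parse.urlencode from the standard library. The helper name build_avatar_url and its default values are assumptions made for illustration; only the base address and the four query parameters come from the example above.

```python
from urllib.parse import urlencode

# Base address taken from the example URL above
BASE_URL = "https://ui-avatars.com/api/"

def build_avatar_url(name, background="03a9f4", color="ffffff", rounded=True):
    """Hypothetical helper: assemble an avatar URL from the four parameters."""
    params = {
        "name": name,
        "background": background,
        "color": color,
        # The example URL spells the flag in lowercase
        "rounded": "true" if rounded else "false",
    }
    # urlencode percent-encodes non-ASCII text such as Chinese characters
    return BASE_URL + "?" + urlencode(params)

print(build_avatar_url("AB"))
```

Building the query string with urlencode rather than by hand keeps Chinese text in name safely escaped.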

Generating the image itself is left for the next post; this one covers generating the random Chinese characters.

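Random Chinese characters

Python does not ship with a module for generating random Chinese characters, but you can use the standard random module to pick a random integer and then convert that Unicode code point to the corresponding character.

Here is a simple example that generates one random Chinese character: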

import random

def get_random_char():
    # Common range for Chinese characters: 0x4e00 ~ 0x9fa5
    val = random.randint(0x4e00, 0x9fa5)
    # Convert the code point to its character
    return chr(val)

print(get_random_char())

To generate several random characters, call get_random_char() in a loop and concatenate the results.

The following code generates 5 random Chinese characters:

import random

def get_random_char():
    # Common range for Chinese characters: 0x4e00 ~ 0x9fa5
    val = random.randint(0x4e00, 0x9fa5)
    # Convert the code point to its character
    return chr(val)

# Generate 5 random Chinese characters
random_chars = ""
for _ in range(5):
    random_chars += get_random_char()

print(random_chars)
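The loop can also be folded into a single function that takes the count as a parameter. This is only a minimal sketch: the name get_random_chars and the optional rng argument are assumptions added here so that results can be reproduced with a seed.

```python
import random

# Range used in the examples above: U+4E00 to U+9FA5
CJK_START, CJK_END = 0x4E00, 0x9FA5

def get_random_chars(count, rng=None):
    """Hypothetical helper: return `count` random characters from the range."""
    rng = rng or random
    # randint is inclusive at both ends, so U+9FA5 itself can be drawn
    return "".join(chr(rng.randint(CJK_START, CJK_END)) for _ in range(count))

print(get_random_chars(5))
# Passing a seeded generator makes the output repeatable
print(get_random_chars(5, random.Random(42)))
```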

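Randomly generating common Chinese characters

Drawing directly from the Unicode range produces many rare characters. In practice there are a few ways around this; one is to find a long piece of text, save it to a text file, and then read the characters back with Python's file I/O.

Take a passage from the internet and save it as demo.txt; it will serve as the source of random characters.

One extra step is needed when reading from demo.txt: the text contains punctuation, which has to be removed before sampling. The string method translate can do this.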

import string

s = "Hello, xiangpica! How are you today?"

# Build a translation table that deletes ASCII punctuation
translator = str.maketrans('', '', string.punctuation)

# Strip the punctuation using the table
s = s.translate(translator)

print(s)

It turns out that this approach only removes ASCII punctuation (such as ! and ?). Removing Unicode punctuation, such as the full-width ， and 。 used in Chinese text, requires a different method.
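To see the limitation concretely, here is a small check with a mixed Chinese and English string. The sample text is made up for this illustration.

```python
import string

mixed = "你好，世界！Hello, world!"

# string.punctuation holds only the ASCII punctuation characters
translator = str.maketrans('', '', string.punctuation)
cleaned = mixed.translate(translator)

# The ASCII comma and exclamation mark are gone; the full-width ， and ！ remain
print(cleaned)  # 你好，世界！Hello world
```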

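The module I finally settled on is Python's unicodedata, which provides a set of functions for working with information about Unicode characters.

Some commonly used unicodedata functions are shown below.

unicodedata.name(chr): returns the Unicode name of the given character.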

import unicodedata

print(unicodedata.name('X'))    # LATIN CAPITAL LETTER X
print(unicodedata.name('0'))    # DIGIT ZERO
print(unicodedata.name('@'))    # COMMERCIAL AT

unicodedata.lookup(name): returns the Unicode character with the given name.

import unicodedata

print(unicodedata.lookup('LATIN CAPITAL LETTER X'))    # X
print(unicodedata.lookup('DIGIT ZERO'))    # 0
print(unicodedata.lookup('COMMERCIAL AT'))    # @

You can use unicodedata.category() to decide whether a character is punctuation (its category code starts with P), and then use translate() to remove it.
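A quick way to see what unicodedata.category() reports is to print the category for a few sample characters. The characters below are chosen only for illustration, and is_punctuation is a hypothetical helper name, not part of the module.

```python
import unicodedata

def is_punctuation(ch):
    """Hypothetical helper: True when the Unicode general category starts with 'P'."""
    return unicodedata.category(ch).startswith('P')

# Print each sample character, its category code, and the punctuation check
for ch in ['!', '，', '。', '《', '中', 'A']:
    print(repr(ch), unicodedata.category(ch), is_punctuation(ch))
```

Both ASCII and full-width punctuation fall under the P categories, while Chinese characters and Latin letters fall under L.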

The following code combines the unicodedata module with translate() to remove all punctuation from a piece of text:

import unicodedata

s = "Hello, xiangpica! How are you today?"

# Build a translation table that maps every punctuation character to None
translator = {ord(c): None for c in s if unicodedata.category(c).startswith('P')}

# Strip the punctuation using the table
s = s.translate(translator)
print(s)

Integrating this into the random character generator gives the following example.

import random
import unicodedata

def get_random_common_char():
    # Read the source text that serves as the pool of common characters
    with open('demo.txt', 'r', encoding='utf-8') as f:
        common_chars = f.read()

    # Remove all whitespace, including spaces and line breaks,
    # so that a newline can never be returned as the "character"
    common_chars = ''.join(common_chars.split())

    # Build a translation table that maps every punctuation character to None
    translator = {ord(c): None for c in common_chars if unicodedata.category(c).startswith('P')}

    # Strip the punctuation using the table
    s = common_chars.translate(translator)

    return random.choice(s)

print(get_random_common_char())

Generating five random Chinese characters:

random_chars = ""
for i in range(5):
    random_chars += get_random_common_char()
print(random_chars)
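One possible refinement, sketched under assumptions that are not taken from the original code: get_random_common_char() rereads demo.txt on every call and can still return digits or Latin letters found in the text. The sketch below builds the pool once and keeps only characters in the 0x4e00 ~ 0x9fa5 range used earlier. The names build_char_pool and pick_chars are hypothetical, and the sample string stands in for the contents of demo.txt.

```python
import random

def build_char_pool(text):
    """Hypothetical helper: keep each distinct character in U+4E00..U+9FA5."""
    # dict.fromkeys removes duplicates while preserving first-seen order
    return list(dict.fromkeys(c for c in text if 0x4E00 <= ord(c) <= 0x9FA5))

def pick_chars(pool, count, rng=random):
    """Hypothetical helper: draw `count` characters from the pool with replacement."""
    return "".join(rng.choice(pool) for _ in range(count))

# Stand-in for the text read from demo.txt
sample_text = "今天天气很好，我们去公园散步。Python 3 is fun!\n"
pool = build_char_pool(sample_text)
print(pick_chars(pool, 5))
```

Deduplicating gives every character the same chance of being picked. Dropping the dict.fromkeys step would instead keep frequent characters more likely, which is closer to how the file-based version behaves.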
