I made a word cloud generator to share with you (code included); worth bookmarking

Preface

The text and pictures of this article are from the Internet, only for learning and communication, not for any commercial purpose. The copyright belongs to the original author. If you have any questions, please contact us in time for handling.

Author: Yankai

If you want to learn Python or are already learning it, there are plenty of Python tutorials around, but are they up to date? You may be studying material that others covered two years ago. I am sharing a batch of up-to-date 2020 Python tutorials here; to get them for free, send me a private message with the word "information".

A few days ago, while making a PPT, I had a lot of keywords to display, and a word cloud came to mind. To my surprise, the online generators I found either charge a fee or do not support custom shapes, so in a fit of pique I wrote my own. Fortunately, Python has the wordcloud library, so the feature came together quickly. After some (amateur) optimization, I think it is basically usable.

The results are as follows:

 

Environment

The version used is Python 3.6.5 (Anaconda)

  • wordcloud: word cloud generation library
  • matplotlib: drawing library, often used with numpy
  • numpy: array library, necessary for matrix operations
  • PIL: image operation module
  • random: random number module
  • jieba: Chinese word segmentation module

Code

# -*- coding: utf-8 -*-
"""
Created on Fri Sep 29 13:14:22 2019
@author: Wild goose
"""

from wordcloud import (WordCloud, ImageColorGenerator)
from matplotlib import pyplot as plt
import numpy as np
from PIL import Image
from random import randint
import jieba

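# In config.txt the even-indexed lines are comments and the odd-indexed lines hold the values
with open('config.txt', 'r') as handler:
    lines = handler.readlines()

a, b = map(int, lines[3].split(','))  # range of the hue (h) used for random colors
bgcolor = lines[7].strip().lower()  # background color
if bgcolor == 'none':
    bgcolor = None  # transparent background
w, h = map(int, lines[9].split(','))  # word cloud size: width, height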

def font_select():
    font_input = lines[5].strip()
    # Map the font names offered in config.txt to their file names in the Windows font folder
    font_dict = {'Microsoft YaHei': r'\msyh.ttc', 'Kaiti': r'\simkai.ttf', 'Songti': r'\simsun.ttc',
                 'Fangsong': r'\simfang.ttf', 'Lishu': r'\SIMLI.TTF', 'Times New Roman': r'\times.ttf'}
    return r'C:\Windows\Fonts' + font_dict[font_input]  # full path to the font file


def shape_judge():  # build the mask for the shape selected by the user
    cloud_shape = lines[11].strip().lower()  # cloud shape
    if cloud_shape == 'rectangle':
        mask = None
    elif cloud_shape == 'round':
        x, y = np.ogrid[:w, :w]
        # Circle centered at (w/2, w/2) with radius w/2; pixels outside it are 255 (masked out)
        mask = 255 * ((x - w / 2) ** 2 + (y - w / 2) ** 2 > (w / 2) ** 2).astype(int)
    else:
        mask = np.array(Image.open(r"shape.jpg"))
    return mask


def color_judge():
    font_color_input = int(lines[1].strip())
    if font_color_input == 1:
        # Font colors are taken from the colors of the shape picture
        image_colors = np.array(Image.open(r"shape.jpg"))
        return ImageColorGenerator(image_colors)
    else:
        def random_color_func(word=None, font_size=None, position=None,
                              orientation=None, font_path=None, random_state=None):
            # h is the hue; look up the hue of a color in an online color picker,
            # then set the range a,b in config.txt to restrict the word colors
            h, s, l = randint(a, b), randint(80, 100), randint(25, 50)
            return "hsl({}, {}%, {}%)".format(h, s, l)
        return random_color_func

def segment_words(text):
    # Segment the text with jieba and join the words with spaces,
    # otherwise the phrases would stay stuck together
    words = jieba.cut(text, cut_all=False)
    return " ".join(words)
        
def segment_judge():  # choose the word segmentation mode
    segment_mode = int(lines[13].strip())
    with open('word.txt', 'r') as handler:
        text = handler.read().lower()
    if segment_mode == 1:
        return text  # the phrases are already separated by spaces
    else:
        return segment_words(text)

stopwords = {'.', '. ', ',', '"', ':', '(', ')', '[', ']', '\n', '\t', ' '}

wordcloud= WordCloud(font_path=font_select(), background_color=bgcolor, mode="RGBA", 
                     color_func = color_judge(),mask= shape_judge(),width=w, 
                     height=h,stopwords = stopwords,margin=2).generate(segment_judge())

# font_path sets the font
# width, height and margin set the picture properties
# background_color sets the background color (the default is black); for a transparent
# background, pass None together with mode="RGBA" as in the code above
# stopwords are tokens that are not displayed in the word cloud, such as punctuation and line breaks

plt.imshow(wordcloud, interpolation='bilinear')  # draw the generated word cloud
plt.axis("off")  # hide the axes
plt.show()
wordcloud.to_file('wordcloud.png')
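
The custom color mode hands WordCloud a function that returns an HSL string for each word. As a minimal sketch of that idea outside the script, the hypothetical helper make_hsl_color_func below builds such a function for a given hue range. The helper name and the seed parameter are assumptions for illustration and are not part of the original code.

```python
from random import Random


def make_hsl_color_func(hue_low, hue_high, seed=None):
    """Build a color function that picks a random hue in [hue_low, hue_high]."""
    rng = Random(seed)  # a private generator, so a seed gives repeatable colors

    def color_func(word=None, font_size=None, position=None,
                   orientation=None, font_path=None, random_state=None):
        hue = rng.randint(hue_low, hue_high)
        saturation = rng.randint(80, 100)  # same ranges as the script above
        lightness = rng.randint(25, 50)
        return "hsl({}, {}%, {}%)".format(hue, saturation, lightness)

    return color_func


pick_color = make_hsl_color_func(40, 80, seed=0)
sample_color = pick_color("python")
print(sample_color)
```

A function built this way could be passed as color_func when constructing the WordCloud, in place of the value returned by color_judge().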

Configuration file

To make things easier for the user (that is, me), I set up a config.txt file (please don't mock the txt format, I am a complete beginner), and all input to the code comes from config.txt.

#1. Font color mode: 1 = picture colors (colors are taken from shape.jpg, so leave option 2 alone and keep option 6 set to picture); 2 = custom colors (set the hue range in option 2)
2
#2. Range of the font color hue (H); see https://www.webfx.com/web-design/color-picker/
40,80
#3. Font, one of: Microsoft YaHei, Kaiti, Songti, Fangsong, Lishu, Times New Roman
Microsoft YaHei
#4. Background color as an English name such as white; None means transparent
None
#5. Size of the word cloud: width,height
1000,800
#6. Cloud shape: rectangle (width x height), round (width x width), or picture (drawn in the shape of shape.jpg in the same directory; the size is the picture's size)
picture
#7. Word segmentation mode: 1 = custom (give the phrases yourself, separated by spaces); 2 = automatic (give a passage of text; the program segments it and draws the words by frequency)
1
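
Reading values by fixed line index breaks as soon as a line is added to or removed from config.txt. One possible alternative, sketched here under the assumption that every comment line starts with # and every other non-blank line is a value, is to skip the comments and keep the values in order. The name read_settings and the shortened sample text are illustrative only.

```python
def read_settings(text):
    """Return the non-comment, non-blank lines of a config text, stripped."""
    values = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line and not line.startswith('#'):
            values.append(line)
    return values


# Shortened stand-in for config.txt (assumption: same layout as the file above)
sample_config = """#1. Font color mode
2
 #2. Hue range
40,80
 #3. Font
 Microsoft YaHei
#4. Background color
None
"""

settings = read_settings(sample_config)
color_mode = int(settings[0])
hue_low, hue_high = map(int, settings[1].split(','))
font_name = settings[2]
print(color_mode, hue_low, hue_high, font_name)
```

With this approach the values are addressed by their order (first option, second option, and so on) instead of by physical line number, and stray leading spaces no longer matter.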

Use

You should have the following files in your project directory:

Cloud word map generation/
  • config.txt
  • shape.jpg
  • word.txt
  • Cloud word map generation.py

Here shape.jpg is the image that defines the shape. With option 6 in config.txt set to picture, the word cloud is generated in the shape of that image.

shape.jpg works best with a white background.
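
wordcloud treats pure white mask pixels (value 255) as off-limits and draws words on everything else, which is why a white background works best. JPEG compression can leave the background slightly off-white, so one option is to snap near-white pixels to pure white before passing the mask in. The sketch below assumes a cutoff of 240; binarize_mask and that threshold are hypothetical choices, not part of the original script.

```python
import numpy as np


def binarize_mask(image_array, threshold=240):
    """Return a 2-D uint8 mask: 255 where a pixel is near-white, 0 elsewhere."""
    arr = np.asarray(image_array)
    if arr.ndim == 3:
        # Color image: a pixel is background only if every RGB channel is bright
        near_white = (arr[..., :3] >= threshold).all(axis=-1)
    else:
        # Grayscale image: compare the single channel directly
        near_white = arr >= threshold
    return np.where(near_white, 255, 0).astype(np.uint8)


# Tiny 2x2 RGB "image": pure white, off-white, and two dark pixels
demo_image = np.array([[[255, 255, 255], [250, 248, 251]],
                       [[10, 20, 30], [200, 0, 0]]], dtype=np.uint8)
demo_mask = binarize_mask(demo_image)
print(demo_mask)
```

In the script, the result of binarize_mask(np.array(Image.open("shape.jpg"))) could be returned from shape_judge() for the picture case.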

Speaking of which, can you tell who is pictured in the word cloud at the beginning of the article?

Guess who he is

Common problems

A common problem is how to configure the font. If you try it yourself, you may wonder: "I clearly pointed to Microsoft YaHei.ttf in the Fonts folder, so why does it say the font cannot be found?"

The reason is simple: the font name shown in the Fonts folder is not the real file name. For example, "Microsoft YaHei" is actually "msyh". So how do you see the real file name? A convenient way is to copy the font to the desktop, where its real file name is displayed.

Oh, by the way, the address of the font folder is C:\Windows\Fonts
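
Another way to see the real file names is to list the folder from Python, which shows the names on disk and not the display names. This is only a sketch: list_font_files and the set of suffixes are assumptions for illustration.

```python
from pathlib import Path

FONT_SUFFIXES = {'.ttf', '.ttc', '.otf'}  # assumed set of font file extensions


def list_font_files(folder):
    """Return the sorted file names of the font files directly inside folder."""
    folder = Path(folder)
    if not folder.is_dir():
        return []  # e.g. when run on a machine without this folder
    return sorted(p.name for p in folder.iterdir()
                  if p.is_file() and p.suffix.lower() in FONT_SUFFIXES)


font_names = list_font_files(r'C:\Windows\Fonts')
for name in font_names:
    print(name)
```

Any name printed here can be used as the file name part of font_path.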

Error when concatenating the path

Note that the font path is built as r'C:\Windows\Fonts' + r'\SIMLI.TTF' rather than r'C:\Windows\Fonts\' + r'SIMLI.TTF'. A string literal cannot end with a single backslash, even with the r prefix: the backslash still escapes the closing quote, so Python reports SyntaxError: EOL while scanning string literal. This looks like a Python bug, but it is documented behavior of raw strings. The workaround is to put the \ at the beginning of the second string.
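
Besides moving the backslash, the path can be built without typing a separator at all. The sketch below shows a few standard-library options and checks with compile() that a raw string ending in a backslash really is rejected. It uses ntpath (the Windows flavor of os.path) so that it behaves the same on any operating system; the variable names are illustrative.

```python
import ntpath
from pathlib import PureWindowsPath

FONT_DIR = r'C:\Windows\Fonts'

# Three ways to get the same path without a raw string that ends in a backslash
joined = ntpath.join(FONT_DIR, 'SIMLI.TTF')
as_path = str(PureWindowsPath(FONT_DIR) / 'SIMLI.TTF')
escaped = 'C:\\Windows\\Fonts\\' + 'SIMLI.TTF'  # ordinary string, doubled backslashes


def compiles(source):
    """Return True if source is a valid Python expression."""
    try:
        compile(source, '<demo>', 'eval')
        return True
    except SyntaxError:
        return False


# The text r'C:\Windows\Fonts\' : the final backslash escapes the closing quote
raw_with_trailing_backslash = "r'C:\\Windows\\Fonts\\'"
raw_without_trailing_backslash = "r'C:\\Windows\\Fonts'"
print(joined, compiles(raw_with_trailing_backslash))
```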


 

Tags: Python Windows Anaconda Google

Posted on Sat, 02 May 2020 10:33:47 -0700 by gdboling