Python basics, day 7

Supplementary notes on the basic data types; more on encoding

  • str

    • capitalize() makes the first letter of the string upper case and all other letters lower case

      s1 = 'I LOVE YOU'
      print(s1.capitalize())
      >>>I love you
    • title() capitalizes the first letter of each word (words are delimited by any non-letter character)

      s1 = 'I LOVE YOU'
      print(s1.title())
      >>>I Love You
    • swapcase() swaps the case of every letter

      s1 = 'I love YOU'
      print(s1.swapcase())
      >>>i LOVE you
    • center() centers the string; it takes 1 required parameter, the width, and 1 optional parameter, the fill character

      s1 = 'I'
      print(s1.center(10,'%'))
      >>>%%%%I%%%%%
    • find() searches for a substring and returns the index of its first occurrence, or -1 if it is not found.

      s1 = 'I love you'
      print(s1.find('o'))
      >>>3
    • index() also returns the index of the first occurrence, but raises a ValueError if the substring is not found.
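The difference between find() and index() on a missing substring can be sketched as follows. The variable names (found, missing, raised) are only illustrative.

```python
s1 = 'I love you'

# Both methods return the index of the first match.
found = s1.index('o')

# find() signals "not found" with -1 ...
missing = s1.find('z')

# ... while index() raises ValueError instead.
try:
    s1.index('z')
    raised = False
except ValueError:
    raised = True

print(found, missing, raised)
```

Use find() when a missing substring is a normal case, and index() when it should be treated as an error.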

  • tuple

    • A peculiarity: if the parentheses contain only one element and no ',', the result is not a tuple; its type is simply the type of the value inside the parentheses

      tu0 = (1,2)
      print(tu0,type(tu0))
      >>>(1, 2) <class 'tuple'>
      
      tu1 = (1)
      print(tu1,type(tu1))
      >>>1 <class 'int'>
      
      tu2 = ([1])
      print(tu2,type(tu2))
      >>>[1] <class 'list'>
      
      tu3 = (1,)
      print(tu3,type(tu3))
      >>>(1,) <class 'tuple'>
    • count() counts how many times an element occurs

      tu = (1,2,3,3,3,3)
      print(tu.count(3))
      >>>4
    • index() returns the index of the first matching element

      tu = ('a','b','a')
      print(tu.index('a'))
      >>>0
  • list

    • index() returns the index of the first matching element

      l1 = ['a','b','a']
      print(l1.index('a'))
      >>>0
    • sort() sorts in place from small to large by default. Set reverse=True to sort from large to small

      l1 = [3,2,1,4]
      l1.sort()
      print(l1)
      >>>[1, 2, 3, 4]
      
      l1.sort(reverse=True)
      print(l1)
      >>>[4, 3, 2, 1]
    • reverse() reverses the list in place

      l1 = [2,1,3,0]
      l1.reverse()
      print(l1)
      >>>[0, 3, 1, 2]
    • Adding lists with + concatenates them into a new list (this works in every Python 3 version)

      l1 = [1,2,3]
      l2 = [3,4,5]
      print(l1+l2)
      >>>[1, 2, 3, 3, 4, 5]
    • Multiplying a list by a number repeats its elements (this works in every Python 3 version)

      l1 = [2,'a',[1,'b']]
      l2 = l1*3
      print(l2)
      >>>[2, 'a', [1, 'b'], 2, 'a', [1, 'b'], 2, 'a', [1, 'b']]
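One detail worth checking with list multiplication: nested objects are not copied, only repeated by reference. A minimal sketch, reusing l1 and l2 from the example above (same_object is an illustrative name):

```python
l1 = [2, 'a', [1, 'b']]
l2 = l1 * 3

# The nested list is not copied: positions 2, 5 and 8 refer to the same object.
same_object = l2[2] is l2[5] is l2[8]

# So a change made through one position is visible through all the others,
# and through l1 as well.
l2[2].append('c')
print(same_object)
print(l2)
print(l1)
```

This is the same sharing behaviour that dict.fromkeys() shows with a mutable default value.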
    • A peculiarity of lists: if an element is deleted while looping forward over a list, every element after it moves one position forward and its index decreases by one. Therefore, if you change the size of a list (add or delete values) while looping over it, you are likely to get an error or a wrong result.

      l1 = [1,2,3,4,5,6]  #Try to delete the elements at indexes 0, 2 and 4
      for i in range(0,len(l1),2):
          l1.pop(i)   #The list shrinks after every pop, so index 4 no longer exists on the third pass
      print(l1)
      >>>IndexError: pop index out of range
      • There are three ways to solve this problem (the examples below delete the elements at odd indexes)

        1. Delete directly (delete by element, delete by index, or delete a slice with a step)

        #Slice plus step
        l1 = [1,2,3,4,5,6]
        del l1[1::2]
        print(l1)
        >>>[1, 3, 5]

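        2. Reverse deletion

        l1 = [1,2,3,4,5,6]
        for i in range(len(l1)-1,-1,-2):
            l1.pop(i)   #Deleting from the end does not shift the elements that are still to be deleted
        print(l1)
        >>>[1, 3, 5]
        
        #The following code does not work (it raises IndexError): negative indexes shift too as the list shrinks. Try it yourself.
        l1 = [1,2,3,4,5,6]
        for i in range(1,len(l1),2):
            l1.pop(-i)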

        3. Change of approach: build a new list from the elements you want to keep

        l1 = [1,2,3,4,5,6]
        l2 = []
        for i in range(0,len(l1),2):
            l2.append(l1[i])
        l1 = l2
        print(l1)
        >>>[1, 3, 5]
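Changing a list while looping over it does not always raise an error; it can also silently give a wrong result. A minimal sketch, with sample data chosen only for illustration:

```python
l1 = [1, 2, 2, 3, 2]

# Removing inside a for loop over the same list skips the element
# that slides into the freed position, so one 2 survives.
for x in l1:
    if x == 2:
        l1.remove(x)
print(l1)

# Building a new list (the third approach, written as a list comprehension)
# removes every 2.
l2 = [1, 2, 2, 3, 2]
l2 = [x for x in l2 if x != 2]
print(l2)
```

The comprehension never modifies the list it is reading from, which is why it is the safer default.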
  • dict

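    • popitem(): in version 3.5 and earlier it deletes an arbitrary key-value pair; from version 3.6 on it deletes the last inserted pair (guaranteed by the language from 3.7). It returns the deleted pair as a (key, value) tuple. Please test it yourself.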
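A minimal sketch of popitem(), assuming Python 3.7 or later so that the last inserted pair is the one removed. The dictionary contents are made up for illustration.

```python
dic = {'name': 'python', 'day': 7, 'topic': 'dict'}

# popitem() removes the most recently inserted pair and returns it as a tuple.
item = dic.popitem()
print(item)
print(dic)

# Calling popitem() on an empty dictionary raises KeyError.
try:
    {}.popitem()
    empty_raised = False
except KeyError:
    empty_raised = True
print(empty_raised)
```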

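    • update() adds key-value pairs or overwrites existing ones

      dic0 = {1:'i'}
      dic0.update({2:'love'},hobby='python')  #Add key-value pairs; keyword arguments only work for keys that are valid identifiers, so the int key 2 is passed in a dict
      print(dic0)
      >>>{1: 'i', 2: 'love', 'hobby': 'python'}
      
      dic0.update({1:'sunlight'}) #Overwrite the value of an existing key
      print(dic0)
      >>>{1: 'sunlight', 2: 'love', 'hobby': 'python'}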
      
      dic1 = {}
      dic1.update([(1,'a'),(2,'b'),(3,'c')])
      print(dic1)
      >>>{1: 'a', 2: 'b', 3: 'c'}
      
      dic0.update(dic1)
      print(dic0)   #Existing keys are overwritten; new keys are added
      >>>{1: 'a', 2: 'b', 'hobby': 'python', 3: 'c'}
      print(dic1)
      >>>{1: 'a', 2: 'b', 3: 'c'}
    • The first parameter of fromkeys() must be an iterable object; every key shares the second parameter as its value (the very same object, same id).

      dic = dict.fromkeys('abc',1)
      print(dic)
      >>>{'a': 1, 'b': 1, 'c': 1}
      
      dic = dict.fromkeys([1,2,3],[])
      print(dic)
      >>>{1: [], 2: [], 3: []}
      dic[1].append('a')
      print(dic)
      >>>{1: ['a'], 2: ['a'], 3: ['a']}
    • Quick exercise (if you change the size of a dictionary while looping over it, you get an error):

      #Delete the key-value pairs whose key starts with 'k' from the dictionary dic
      dic = {'k1':'a','k2':'b','k3':'c','a':'d'}
      l1 = []
      for key in dic:
          if key.startswith('k'):
              l1.append(key)   #Collect the keys first, delete afterwards
      for i in l1:
          dic.pop(i)
      print(dic)
      >>>{'a': 'd'}
      
      
      #Improvement
      dic = {'k1':'a','k2':'b','k3':'c','a':'d'}
      for key in list(dic.keys()):   #Loop over a list copy of the keys. Without list() you get RuntimeError: dictionary changed size during iteration.
          if key.startswith('k'):
              dic.pop(key)
      print(dic)
      >>>{'a': 'd'}
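For comparison, here is a sketch of what happens without the list copy, followed by a dict comprehension as one more way to do the exercise. The names error and kept are illustrative.

```python
dic = {'k1': 'a', 'k2': 'b', 'k3': 'c', 'a': 'd'}

# Deleting while looping over the dictionary itself fails on the next iteration.
try:
    for key in dic:
        if key.startswith('k'):
            dic.pop(key)
    error = None
except RuntimeError as exc:
    error = str(exc)
print(error)

# Alternative: build a new dictionary that keeps only the wanted pairs.
dic = {'k1': 'a', 'k2': 'b', 'k3': 'c', 'a': 'd'}
kept = {key: value for key, value in dic.items() if not key.startswith('k')}
print(kept)
```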
  • Converting between data types:

    • int, bool, str conversion

    • str, list conversion

    • list, set conversion

    • str, bytes conversion

    • Every value can be converted to a bool:

      The values that convert to False are:

      '',0,(),{},[],set(),None
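The conversions listed above can be sketched in a few lines. The sample values and variable names are only illustrative.

```python
# int, str, bool
n = int('42')
s = str(42)
flags = (bool(0), bool(7), bool(''), bool('0'))  # a non-empty string is True, even '0'

# str, list
words = 'I love you'.split()   # split on whitespace
joined = ' '.join(words)       # join the pieces back together
chars = list('abc')

# list, set: converting to a set removes duplicates (a set has no order,
# so the result is sorted here to get a predictable list back)
unique = sorted(set([3, 1, 3, 2, 1]))

# str, bytes
data = 'abc'.encode('utf-8')
text = data.decode('utf-8')

# Every value in the list above converts to False.
falsy = ['', 0, (), {}, [], set(), None]
all_false = not any(bool(v) for v in falsy)

print(n, s, flags, words, joined, chars, unique, data, text, all_false)
```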
  • Summary of the basic data types

    • Ordered by how much storage space they take (from low to high)
      • int
      • str
      • set: unordered
      • tuple: ordered, immutable
      • list: ordered, mutable
      • dict: ordered (insertion order is kept from version 3.6 on), mutable
  • More on encoding:

    • Data encoded with one encoding cannot be read with a different encoding

    • In memory, Python keeps text (str) as Unicode; when the data is sent over the network or stored on disk, it must be encoded with a concrete encoding (utf-8, gbk, etc.)

    • So before Python data goes from memory (Unicode) to disk or over the network, it has to be converted to a special data type whose content is encoded with such an encoding: the bytes type

      • bytes supports mostly the same operations as str
      • A bytes literal can only contain ASCII characters. Writing Chinese text directly in a b'...' literal raises an error; non-ASCII text has to be converted with encode()
      #bytes literal for ASCII text:
      a = b'iloveyou'
      print(a,type(a))
      >>>b'iloveyou' <class 'bytes'>
      
      #Chinese text cannot be written as a bytes literal:
      b = b'山就在那儿'   #"The mountain is right there"
      print(b)
      >>>SyntaxError: bytes can only contain ASCII literal characters
      #The correct method is:
      
      c = '山就在那儿'
      b = c.encode('utf-8')   #encode() converts str to bytes; utf-8 is the usual choice
      print(b)   #or print(c.encode('utf-8'))
      >>>b'\xe5\xb1\xb1\xe5\xb0\xb1\xe5\x9c\xa8\xe9\x82\xa3\xe5\x84\xbf'
      • bytes can be converted back to str (Unicode) with decode(). Decode with the same encoding that was used to encode.

        b = b'\xe5\xb1\xb1\xe5\xb0\xb1\xe5\x9c\xa8\xe9\x82\xa3\xe5\x84\xbf'
        c = b.decode('utf-8')    #or print(b.decode('utf-8'))
        print(c)
        >>>山就在那儿
        
        #Decoding with a different encoding than the one used for encoding fails:
        b = b'\xe5\xb1\xb1\xe5\xb0\xb1\xe5\x9c\xa8\xe9\x82\xa3\xe5\x84\xbf'
        c = b.decode('gbk')
        print(c)
        >>>UnicodeDecodeError: 'gbk' codec can't decode byte 0xbf in position 14: incomplete multibyte sequence
  • Quick exercise: convert gbk to utf-8

    #Every encoding converts to and from Unicode (the in-memory form of str), so decode the gbk bytes to a str first and then encode that str as utf-8.
    
    gbk = b'\xc9\xbd\xbe\xcd\xd4\xda\xc4\xc7\xb6\xf9'
    decode1 = gbk.decode('gbk')    #Decode to a str (Unicode); check it with print(decode1).
    print(decode1.encode('utf-8'))  #Encode as utf-8
    >>>b'\xe5\xb1\xb1\xe5\xb0\xb1\xe5\x9c\xa8\xe9\x82\xa3\xe5\x84\xbf'
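The decode-then-encode step can be wrapped in a small helper. This is only a sketch: transcode is a hypothetical name, not a built-in, and the default encodings are assumptions taken from the exercise.

```python
def transcode(data, source='gbk', target='utf-8'):
    """Decode bytes from `source` to str (Unicode), then encode the str as `target`."""
    return data.decode(source).encode(target)


gbk = b'\xc9\xbd\xbe\xcd\xd4\xda\xc4\xc7\xb6\xf9'

# gbk -> str -> utf-8
utf8 = transcode(gbk)
print(utf8)

# The same helper works in the other direction: utf-8 -> str -> gbk
back = transcode(utf8, source='utf-8', target='gbk')
print(back == gbk)
```

There is no direct bytes-to-bytes conversion between two encodings; the str in the middle is always the bridge.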


Tags: Python encoding network ascii

Posted on Tue, 17 Mar 2020 04:58:01 -0700 by Salsaboy