[Guava] A pitfall when using Iterators to split data into batches

We often need to process a large list in batches. With Guava this is usually done with Iterators#partition or Iterators#paddedPartition (the Iterables counterparts and Lists#partition work as well). A typical usage:

List<Order> result = Lists.newArrayListWithCapacity(orders.size());
for (List<String> orderList : Iterables.paddedPartition(orders, DEFAULT_MAX_SIZE)) {
    // process one batch here ...
}

paddedPartition has a catch: when the last batch holds fewer than DEFAULT_MAX_SIZE elements, the remaining slots are filled with null. For example:

list = [1,2,3,4,5] -------paddedPartition(list,3)--------> [[1,2,3],[4,5,null]]
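
If a padded batch has already reached your code, the trailing nulls need to be dropped before the elements are used. The following is only a minimal sketch in plain Java: `PaddedBatchCleaner` and `stripPadding` are hypothetical names, the padded batch is built by hand rather than by Guava, and it assumes null is never a legitimate element of the data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PaddedBatchCleaner {

    // Hypothetical helper: copies the non-null elements of a batch into a new list.
    // Assumption: null never appears as real data, so every null is padding.
    static <T> List<T> stripPadding(List<T> batch) {
        List<T> cleaned = new ArrayList<>(batch.size());
        for (T item : batch) {
            if (item != null) {
                cleaned.add(item);
            }
        }
        return cleaned;
    }

    public static void main(String[] args) {
        // Hand-built stand-in for the last batch of paddedPartition([1,2,3,4,5], 3)
        List<Integer> lastBatch = Arrays.asList(4, 5, null);
        System.out.println(stripPadding(lastBatch)); // prints [4, 5]
    }
}
```

Filtering like this is a workaround; when padding is not needed at all, it is simpler to avoid producing it in the first place.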

Let's look at the source code to see why:

public static <T> UnmodifiableIterator<List<T>> paddedPartition(Iterator<T> iterator, int size) {
    return partitionImpl(iterator, size, true);
  }

  private static <T> UnmodifiableIterator<List<T>> partitionImpl(
      final Iterator<T> iterator, final int size, final boolean pad) {
    checkNotNull(iterator);
    checkArgument(size > 0);
    return new UnmodifiableIterator<List<T>>() {
      @Override
      public boolean hasNext() {
        return iterator.hasNext();
      }

      @Override
      public List<T> next() {
        if (!hasNext()) {
          throw new NoSuchElementException();
        }
        Object[] array = new Object[size];
        int count = 0;
        for (; count < size && iterator.hasNext(); count++) {
          array[count] = iterator.next();
        }
        for (int i = count; i < size; i++) {
          array[i] = null; // Key step: the unfilled slots of the last batch are set to null here
        }

        @SuppressWarnings("unchecked") // we only put Ts in it
        List<T> list = Collections.unmodifiableList((List<T>) Arrays.asList(array));
        // If pad is true (or the batch is full), the whole list is returned, nulls included;
        // otherwise it is trimmed to the first count elements
        return (pad || count == size) ? list : list.subList(0, count);
      }
    };
  }

Now let's look at its sibling method, partition:

 public static <T> UnmodifiableIterator<List<T>> partition(Iterator<T> iterator, int size) {
    return partitionImpl(iterator, size, false);
  }

Both methods delegate to the same partitionImpl; paddedPartition passes pad = true and partition passes pad = false. If the null padding is not wanted, use partition instead.
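
The effect of the pad flag can be reproduced with a small stand-in written in plain Java. This is only an illustration under stated assumptions: `PadFlagSketch` and `chunk` are hypothetical names, and the helper builds all batches eagerly from a List, whereas Guava's partitionImpl is lazy and works on an Iterator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PadFlagSketch {

    // Illustration only: eager, list-based stand-in for the pad logic shown above.
    static <T> List<List<T>> chunk(List<T> source, int size, boolean pad) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < source.size(); start += size) {
            int end = Math.min(start + size, source.size());
            List<T> batch = new ArrayList<>(source.subList(start, end));
            // With pad == true a short last batch is filled up with nulls
            while (pad && batch.size() < size) {
                batch.add(null);
            }
            batches.add(batch);
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> list = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(chunk(list, 3, true));  // prints [[1, 2, 3], [4, 5, null]]
        System.out.println(chunk(list, 3, false)); // prints [[1, 2, 3], [4, 5]]
    }
}
```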

Lists#partition, mentioned above, takes a different route: it returns an instance of one of two dedicated classes, chosen by whether the list implements RandomAccess:

public static <T> List<List<T>> partition(List<T> list, int size) {
    checkNotNull(list);
    checkArgument(size > 0);
    return (list instanceof RandomAccess)
        ? new RandomAccessPartition<T>(list, size)
        : new Partition<T>(list, size);
  }
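
The idea behind such a partition class can be sketched as a read-only view whose elements are subList slices of the original list, so no elements are copied and the last batch is simply shorter instead of being padded. This is a simplified sketch, not Guava's code: `PartitionView` is a hypothetical name and the RandomAccess distinction is left out.

```java
import java.util.AbstractList;
import java.util.Arrays;
import java.util.List;

public class PartitionView<T> extends AbstractList<List<T>> {
    private final List<T> source;
    private final int size;

    public PartitionView(List<T> source, int size) {
        if (source == null) {
            throw new NullPointerException("source");
        }
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        this.source = source;
        this.size = size;
    }

    @Override
    public List<T> get(int index) {
        if (index < 0 || index >= size()) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        int start = index * size;
        // The last batch may be shorter than size; it is never padded
        int end = start + Math.min(size, source.size() - start);
        return source.subList(start, end);
    }

    @Override
    public int size() {
        // Number of batches: source.size() / size, rounded up
        return source.size() / size + (source.size() % size == 0 ? 0 : 1);
    }

    public static void main(String[] args) {
        List<List<Integer>> batches = new PartitionView<Integer>(Arrays.asList(1, 2, 3, 4, 5), 3);
        System.out.println(batches); // prints [[1, 2, 3], [4, 5]]
    }
}
```

Because each batch is a view, changes to the underlying list show through in the batches, which is worth keeping in mind if the source list is modified while batches are still in use.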


Posted on Sun, 05 Apr 2020 08:47:29 -0700 by bpp198