03-16 04:37:09.696 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:49 cmd_run(): start to cmd run: grep Port /etc/ssh/sshd_config
03-16 04:37:09.696 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:51 cmd_run(): grep
03-16 04:37:09.696 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:51 cmd_run(): Port
03-16 04:37:09.696 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:51 cmd_run(): /etc/ssh/sshd_config
#Port 22
#GatewayPorts no
03-16 04:37:09.700 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi
03-16 04:37:09.700 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:51 cmd_run(): nvidia-smi
Wed Mar 16 04:37:09 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.82.01    Driver Version: 470.82.01    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla V100-SXM2...  On   | 00000001:00:00.0 Off |                    0 |
| N/A   38C    P0    40W / 300W |      0MiB / 32510MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  Tesla V100-SXM2...  On   | 00000002:00:00.0 Off |                    0 |
| N/A   41C    P0    44W / 300W |      0MiB / 32510MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  Tesla V100-SXM2...  On   | 00000003:00:00.0 Off |                    0 |
| N/A   39C    P0    42W / 300W |      0MiB / 32510MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   3  Tesla V100-SXM2...  On   | 00000004:00:00.0 Off |                    0 |
| N/A   41C    P0    41W / 300W |      0MiB / 32510MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   4  Tesla V100-SXM2...  On   | 00000005:00:00.0 Off |                    0 |
| N/A   37C    P0    41W / 300W |      0MiB / 32510MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   5  Tesla V100-SXM2...  On   | 00000006:00:00.0 Off |                    0 |
| N/A   40C    P0    42W / 300W |      0MiB / 32510MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   6  Tesla V100-SXM2...  On   | 00000007:00:00.0 Off |                    0 |
| N/A   40C    P0    44W / 300W |      0MiB / 32510MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   7  Tesla V100-SXM2...  On   | 00000008:00:00.0 Off |                    0 |
| N/A   41C    P0    41W / 300W |      0MiB / 32510MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
03-16 04:37:14.822 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:49 cmd_run(): start to cmd run: ifconfig
03-16 04:37:14.822 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:51 cmd_run(): ifconfig
docker0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        inet 172.17.0.1  netmask 255.255.0.0  broadcast 172.17.255.255
        ether 02:42:5e:07:9c:92  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.0.0.8  netmask 255.255.224.0  broadcast 10.0.31.255
        inet6 fe80::222:48ff:fe77:1ca5  prefixlen 64  scopeid 0x20<link>
        ether 00:22:48:77:1c:a5  txqueuelen 1000  (Ethernet)
        RX packets 167468324  bytes 88391604986 (88.3 GB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 113298509  bytes 151614183851 (151.6 GB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 172540795  bytes 9259304630 (9.2 GB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 172540795  bytes 9259304630 (9.2 GB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

03-16 04:37:14.826 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:49 cmd_run(): start to cmd run: df -h
03-16 04:37:14.826 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:51 cmd_run(): df
03-16 04:37:14.826 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:51 cmd_run(): -h
03-16 04:37:14.830 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:49 cmd_run(): start to cmd run: ls /dev
03-16 04:37:14.831 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:51 cmd_run(): ls
03-16 04:37:14.831 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:51 cmd_run(): /dev
autofs bsg btrfs-control console core cpu_dma_latency cuse dri ecryptfs fb0 fd
full fuse gdrdrv hpet hwrng infiniband input isst_interface kmsg knem loop0
loop1 loop2 loop3 loop4 loop5 loop6 loop7 loop-control mapper mcelog mem mqueue
net null nvidia0 nvidia1 nvidia2 nvidia3 nvidia4 nvidia5 nvidia6 nvidia7
nvidia-caps nvidiactl nvidia-modeset nvidia-nvswitchctl nvidia-uvm
nvidia-uvm-tools nvram port ppp psaux ptmx ptp0 ptp1 pts random rfkill rtc0
sda sda1 sda14 sda15 sdb sdb1 sg0 sg1 shm snapshot stderr stdin stdout tty
tty0 tty1 tty10 tty11 tty12 tty13 tty14 tty15 tty16 tty17 tty18 tty19 tty2
tty20 tty21 tty22 tty23 tty24 tty25 tty26 tty27 tty28 tty29 tty3 tty30 tty31
tty32 tty33 tty34 tty35 tty36 tty37 tty38 tty39 tty4 tty40 tty41 tty42 tty43
tty44 tty45 tty46 tty47 tty48 tty49 tty5 tty50 tty51 tty52 tty53 tty54 tty55
tty56 tty57 tty58 tty59 tty6 tty60 tty61 tty62 tty63 tty7 tty8 tty9 ttyprintk
ttyS0 ttyS1 ttyS10 ttyS11 ttyS12 ttyS13 ttyS14 ttyS15 ttyS16 ttyS17 ttyS18
ttyS19 ttyS2 ttyS20 ttyS21 ttyS22 ttyS23 ttyS24 ttyS25 ttyS26 ttyS27 ttyS28
ttyS29 ttyS3 ttyS30 ttyS31 ttyS4 ttyS5 ttyS6 ttyS7 ttyS8 ttyS9 udmabuf uhid
uinput urandom userio vcs vcs1 vcs2 vcs3 vcs4 vcs5 vcs6 vcsa
vcsa1 vcsa2 vcsa3 vcsa4 vcsa5 vcsa6 vcsu vcsu1 vcsu2 vcsu3 vcsu4 vcsu5 vcsu6
vfio vga_arbiter vhost-net vhost-vsock vmbus zero zfs
+ ulimit -n 262144
+ cat /etc/security/limits.conf
# /etc/security/limits.conf
#
#Each line describes a limit for a user in the form:
#
#<domain>        <type>  <item>  <value>
#
#Where:
#<domain> can be:
#        - a user name
#        - a group name, with @group syntax
#        - the wildcard *, for default entry
#        - the wildcard %, can be also used with %group syntax,
#                 for maxlogin limit
#        - NOTE: group and wildcard limits are not applied to root.
#          To apply a limit to the root user, <domain> must be
#          the literal username root.
#
#<type> can have the two values:
#        - "soft" for enforcing the soft limits
#        - "hard" for enforcing hard limits
#
#<item> can be one of the following:
#        - core - limits the core file size (KB)
#        - data - max data size (KB)
#        - fsize - maximum filesize (KB)
#        - memlock - max locked-in-memory address space (KB)
#        - nofile - max number of open files
#        - rss - max resident set size (KB)
#        - stack - max stack size (KB)
#        - cpu - max CPU time (MIN)
#        - nproc - max number of processes
#        - as - address space limit (KB)
#        - maxlogins - max number of logins for this user
#        - maxsyslogins - max number of logins on the system
#        - priority - the priority to run user process with
#        - locks - max number of file locks the user can hold
#        - sigpending - max number of pending signals
#        - msgqueue - max memory used by POSIX message queues (bytes)
#        - nice - max nice priority allowed to raise to values: [-20, 19]
#        - rtprio - max realtime priority
#        - chroot - change root to directory (Debian-specific)
#
#<domain>      <type>  <item>         <value>
#
#*               soft    core            0
#root            hard    core            100000
#*               hard    rss             10000
#@student        hard    nproc           20
#@faculty        soft    nproc           20
#@faculty        hard    nproc           50
#ftp             hard    nproc           0
#ftp             -       chroot          /ftp
#@student        -       maxlogins       4

# End of file
+ ulimit -n 999999
+ ulimit -Hn 999999
+ ulimit -Sn 999999
+ ulimit -n 999999
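(The launch script raises the open-file limit from the shell before installing dependencies. For reference, the same adjustment can be made from inside Python with the standard resource module; this is a minimal illustrative sketch, not part of the job script.)

import resource

# Mirror of the `ulimit -n 999999` calls above. Raising the hard limit
# requires privileges; this container runs as root, so both succeed.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print('before:', soft, hard)
resource.setrlimit(resource.RLIMIT_NOFILE, (999999, 999999))
print('after:', resource.getrlimit(resource.RLIMIT_NOFILE))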
+ pip install -r requirements.txt
Collecting git+https://github.com/rwightman/pytorch-image-models.git (from -r requirements.txt (line 49))
  Cloning https://github.com/rwightman/pytorch-image-models.git to /tmp/pip-req-build-8enxxbsc
  Running command git clone -q https://github.com/rwightman/pytorch-image-models.git /tmp/pip-req-build-8enxxbsc
Requirement already satisfied: Deprecated in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 1)) (1.2.13)
Requirement already satisfied: pymongo in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 2)) (4.0.2)
Requirement already satisfied: azure in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 3)) (2.0.0)
Requirement already satisfied: azure-storage-blob==2.1.0 in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 4)) (2.1.0)
Requirement already satisfied: Cython in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 5)) (0.29.23)
Requirement already satisfied: django in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 6)) (3.2.12)
Requirement already satisfied: easydict in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 7)) (1.9)
Requirement already satisfied: ete3 in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 8)) (3.1.2)
Requirement already satisfied: future in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 9)) (0.18.2)
Requirement already satisfied: ipython in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 10)) (7.32.0)
Requirement already satisfied: jinja2 in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 11)) (3.0.3)
Requirement already satisfied: scikit-image in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 12)) (0.19.2)
Requirement already satisfied: matplotlib in /opt/conda/lib/python3.7/site-packages/matplotlib-3.4.2-py3.7-linux-x86_64.egg (from -r requirements.txt (line 13)) (3.4.2)
Requirement already satisfied: nltk in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 14)) (3.7)
Requirement already satisfied: opencv-python in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 15)) (4.5.5.64)
Requirement already satisfied: orderedset in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 16)) (2.0.3)
Requirement already satisfied: pathos in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 17)) (0.2.8)
Requirement already satisfied: pillow in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 18)) (8.1.0)
Requirement already satisfied: progressbar in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 19)) (2.5)
Requirement already satisfied: protobuf in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 20)) (3.19.4)
Requirement already satisfied: psutil in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 21)) (5.9.0)
Requirement already satisfied: python-magic in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 23)) (0.4.25)
Requirement already satisfied: pyyaml in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 24)) (5.4.1)
Requirement already satisfied: simplejson in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 25)) (3.17.6)
Requirement already satisfied: traceback2 in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 26)) (1.4.0)
Requirement already satisfied: tb-nightly in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 27)) (2.9.0a20220313)
Requirement already satisfied: yacs in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 28)) (0.1.8)
Requirement already satisfied: sklearn in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 29)) (0.0)
Requirement already satisfied: torchlars in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 30)) (0.1.2)
Requirement already satisfied: boto3 in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 31)) (1.21.20)
Requirement already satisfied: anytree in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 32)) (2.8.0)
Requirement already satisfied: Ninja in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 33)) (1.10.0.post2)
Requirement already satisfied: pytorch_lamb in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 34)) (1.0.0)
Requirement already satisfied: timm in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 35)) (0.5.5)
Requirement already satisfied: dataclasses in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 36)) (0.6)
Requirement already satisfied: pytorch_lightning==1.1.4 in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 37)) (1.1.4)
Requirement already satisfied: transformers==4.2.1 in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 38)) (4.2.1)
Requirement already satisfied: torchvision==0.8.0 in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 39)) (0.8.0)
Requirement already satisfied: torch==1.7.0 in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 40)) (1.7.0)
Requirement already satisfied: tqdm==4.56.0 in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 42)) (4.56.0)
Requirement already satisfied: ipdb==0.13.4 in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 43)) (0.13.4)
Requirement already satisfied: numpy==1.20.0 in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 44)) (1.20.0)
Requirement already satisfied: einops==0.3.0 in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 45)) (0.3.0)
Requirement already satisfied: pyarrow==2.0.0 in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 46)) (2.0.0)
Requirement already satisfied: sacred==0.8.2 in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 47)) (0.8.2)
Requirement already satisfied: pandas==1.1.5 in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 48)) (1.1.5)
Requirement already satisfied: numba in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 50)) (0.55.1)
Requirement already satisfied: kmeans_pytorch in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 51)) (0.3)
Requirement already satisfied: pycocotools==2.0.0 in /opt/conda/lib/python3.7/site-packages/pycocotools-2.0-py3.7-linux-x86_64.egg (from -r requirements.txt (line 52)) (2.0)
Requirement already satisfied: fairscale in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 53)) (0.4.2)
Requirement already satisfied: azure-common>=1.1.5 in /opt/conda/lib/python3.7/site-packages (from azure-storage-blob==2.1.0->-r requirements.txt (line 4)) (1.1.28)
Requirement already satisfied: azure-storage-common~=2.1 in /opt/conda/lib/python3.7/site-packages (from azure-storage-blob==2.1.0->-r requirements.txt (line 4)) (2.1.0)
Requirement already satisfied: fsspec[http]>=0.8.1 in /opt/conda/lib/python3.7/site-packages (from pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (2022.2.0)
Requirement already satisfied: tensorboard>=2.2.0 in /opt/conda/lib/python3.7/site-packages (from pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (2.8.0)
Requirement already satisfied: sacremoses in /opt/conda/lib/python3.7/site-packages (from transformers==4.2.1->-r requirements.txt (line 38)) (0.0.49)
Requirement already satisfied: requests in /opt/conda/lib/python3.7/site-packages (from transformers==4.2.1->-r requirements.txt (line 38)) (2.24.0)
Requirement already satisfied: importlib-metadata in /opt/conda/lib/python3.7/site-packages (from transformers==4.2.1->-r requirements.txt (line 38)) (4.11.3)
Requirement already satisfied: regex!=2019.12.17 in /opt/conda/lib/python3.7/site-packages (from transformers==4.2.1->-r requirements.txt (line 38)) (2022.3.15)
Requirement already satisfied: packaging in /opt/conda/lib/python3.7/site-packages (from transformers==4.2.1->-r requirements.txt (line 38)) (21.3)
Requirement already satisfied: filelock in /opt/conda/lib/python3.7/site-packages (from transformers==4.2.1->-r requirements.txt (line 38)) (3.6.0)
Requirement already satisfied: tokenizers==0.9.4 in /opt/conda/lib/python3.7/site-packages (from transformers==4.2.1->-r requirements.txt (line 38)) (0.9.4)
Requirement already satisfied: typing-extensions in /opt/conda/lib/python3.7/site-packages (from torch==1.7.0->-r requirements.txt (line 40)) (3.7.4.3)
Requirement already satisfied: setuptools in /opt/conda/lib/python3.7/site-packages (from ipdb==0.13.4->-r requirements.txt (line 43)) (52.0.0.post20210125)
Requirement already satisfied: munch<3.0,>=2.0.2 in /opt/conda/lib/python3.7/site-packages (from sacred==0.8.2->-r requirements.txt (line 47)) (2.5.0)
Requirement already satisfied: py-cpuinfo>=4.0 in /opt/conda/lib/python3.7/site-packages (from sacred==0.8.2->-r requirements.txt (line 47)) (8.0.0)
Requirement already satisfied: wrapt<2.0,>=1.0 in /opt/conda/lib/python3.7/site-packages (from sacred==0.8.2->-r requirements.txt (line 47)) (1.14.0)
Requirement already satisfied: GitPython in /opt/conda/lib/python3.7/site-packages (from sacred==0.8.2->-r requirements.txt (line 47)) (3.1.27)
Requirement already satisfied: jsonpickle<2.0,>=1.2 in /opt/conda/lib/python3.7/site-packages (from sacred==0.8.2->-r requirements.txt (line 47)) (1.5.2)
Requirement already satisfied: docopt<1.0,>=0.3 in /opt/conda/lib/python3.7/site-packages (from sacred==0.8.2->-r requirements.txt (line 47)) (0.6.2)
Requirement already satisfied: colorama>=0.4 in /opt/conda/lib/python3.7/site-packages (from sacred==0.8.2->-r requirements.txt (line 47)) (0.4.4)
Requirement already satisfied: pytz>=2017.2 in /opt/conda/lib/python3.7/site-packages (from pandas==1.1.5->-r requirements.txt (line 48)) (2021.3)
Requirement already satisfied: python-dateutil>=2.7.3 in /opt/conda/lib/python3.7/site-packages/python_dateutil-2.8.1-py3.7.egg (from pandas==1.1.5->-r requirements.txt (line 48)) (2.8.1)
Requirement already satisfied: pickleshare in /opt/conda/lib/python3.7/site-packages (from ipython->-r requirements.txt (line 10)) (0.7.5)
Requirement already satisfied: matplotlib-inline in /opt/conda/lib/python3.7/site-packages (from ipython->-r requirements.txt (line 10)) (0.1.3)
Requirement already satisfied: jedi>=0.16 in /opt/conda/lib/python3.7/site-packages (from ipython->-r requirements.txt (line 10)) (0.18.1)
Requirement already satisfied: pygments in /opt/conda/lib/python3.7/site-packages (from ipython->-r requirements.txt (line 10)) (2.11.2)
Requirement already satisfied: decorator in /opt/conda/lib/python3.7/site-packages (from ipython->-r requirements.txt (line 10)) (5.1.1)
Requirement already satisfied: pexpect>4.3 in /opt/conda/lib/python3.7/site-packages (from ipython->-r requirements.txt (line 10)) (4.8.0)
Requirement already satisfied: backcall in /opt/conda/lib/python3.7/site-packages (from ipython->-r requirements.txt (line 10)) (0.2.0)
Requirement already satisfied: prompt-toolkit!=3.0.0,!=3.0.1,<3.1.0,>=2.0.0 in /opt/conda/lib/python3.7/site-packages (from ipython->-r requirements.txt (line 10)) (3.0.28)
Requirement already satisfied: traitlets>=4.2 in /opt/conda/lib/python3.7/site-packages (from ipython->-r requirements.txt (line 10)) (5.1.1)
Requirement already satisfied: cycler>=0.10 in /opt/conda/lib/python3.7/site-packages/cycler-0.10.0-py3.7.egg (from matplotlib->-r requirements.txt (line 13)) (0.10.0)
Requirement already satisfied: kiwisolver>=1.0.1 in /opt/conda/lib/python3.7/site-packages/kiwisolver-1.3.1-py3.7-linux-x86_64.egg (from matplotlib->-r requirements.txt (line 13)) (1.3.1)
Requirement already satisfied: pyparsing>=2.2.1 in /opt/conda/lib/python3.7/site-packages/pyparsing-3.0.0b2-py3.7.egg (from matplotlib->-r requirements.txt (line 13)) (3.0.0b2)
Requirement already satisfied: cryptography in /opt/conda/lib/python3.7/site-packages (from azure-storage-common~=2.1->azure-storage-blob==2.1.0->-r requirements.txt (line 4)) (3.4.7)
Requirement already satisfied: six in /opt/conda/lib/python3.7/site-packages (from cycler>=0.10->matplotlib->-r requirements.txt (line 13)) (1.15.0)
Requirement already satisfied: aiohttp in /opt/conda/lib/python3.7/site-packages (from fsspec[http]>=0.8.1->pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (3.8.1)
Requirement already satisfied: parso<0.9.0,>=0.8.0 in /opt/conda/lib/python3.7/site-packages (from jedi>=0.16->ipython->-r requirements.txt (line 10)) (0.8.3)
Requirement already satisfied: ptyprocess>=0.5 in /opt/conda/lib/python3.7/site-packages (from pexpect>4.3->ipython->-r requirements.txt (line 10)) (0.7.0)
Requirement already satisfied: wcwidth in /opt/conda/lib/python3.7/site-packages (from prompt-toolkit!=3.0.0,!=3.0.1,<3.1.0,>=2.0.0->ipython->-r requirements.txt (line 10)) (0.2.5)
Requirement already satisfied: google-auth<3,>=1.6.3 in /opt/conda/lib/python3.7/site-packages (from tensorboard>=2.2.0->pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (2.6.0)
Requirement already satisfied: tensorboard-plugin-wit>=1.6.0 in /opt/conda/lib/python3.7/site-packages (from tensorboard>=2.2.0->pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (1.8.1)
Requirement already satisfied: google-auth-oauthlib<0.5,>=0.4.1 in /opt/conda/lib/python3.7/site-packages (from tensorboard>=2.2.0->pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (0.4.6)
Requirement already satisfied: tensorboard-data-server<0.7.0,>=0.6.0 in /opt/conda/lib/python3.7/site-packages (from tensorboard>=2.2.0->pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (0.6.1)
Requirement already satisfied: grpcio>=1.24.3 in /opt/conda/lib/python3.7/site-packages (from tensorboard>=2.2.0->pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (1.44.0)
Requirement already satisfied: werkzeug>=0.11.15 in /opt/conda/lib/python3.7/site-packages (from tensorboard>=2.2.0->pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (2.0.3)
Requirement already satisfied: wheel>=0.26 in /opt/conda/lib/python3.7/site-packages (from tensorboard>=2.2.0->pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (0.35.1)
Requirement already satisfied: markdown>=2.6.8 in /opt/conda/lib/python3.7/site-packages (from tensorboard>=2.2.0->pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (3.3.6)
Requirement already satisfied: absl-py>=0.4 in /opt/conda/lib/python3.7/site-packages (from tensorboard>=2.2.0->pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (1.0.0)
Requirement already satisfied: pyasn1-modules>=0.2.1 in /opt/conda/lib/python3.7/site-packages (from google-auth<3,>=1.6.3->tensorboard>=2.2.0->pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (0.2.8)
Requirement already satisfied: rsa<5,>=3.1.4 in /opt/conda/lib/python3.7/site-packages (from google-auth<3,>=1.6.3->tensorboard>=2.2.0->pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (4.8)
Requirement already satisfied: cachetools<6.0,>=2.0.0 in /opt/conda/lib/python3.7/site-packages (from google-auth<3,>=1.6.3->tensorboard>=2.2.0->pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (5.0.0)
Requirement already satisfied: requests-oauthlib>=0.7.0 in /opt/conda/lib/python3.7/site-packages (from google-auth-oauthlib<0.5,>=0.4.1->tensorboard>=2.2.0->pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (1.3.1)
Requirement already satisfied: zipp>=0.5 in /opt/conda/lib/python3.7/site-packages (from importlib-metadata->transformers==4.2.1->-r requirements.txt (line 38)) (3.7.0)
Requirement already satisfied: pyasn1<0.5.0,>=0.4.6 in /opt/conda/lib/python3.7/site-packages (from pyasn1-modules>=0.2.1->google-auth<3,>=1.6.3->tensorboard>=2.2.0->pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (0.4.8)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /opt/conda/lib/python3.7/site-packages (from requests->transformers==4.2.1->-r requirements.txt (line 38)) (1.25.11)
Requirement already satisfied: idna<3,>=2.5 in /opt/conda/lib/python3.7/site-packages (from requests->transformers==4.2.1->-r requirements.txt (line 38)) (2.10)
Requirement already satisfied: chardet<4,>=3.0.2 in /opt/conda/lib/python3.7/site-packages (from requests->transformers==4.2.1->-r requirements.txt (line 38)) (3.0.4)
Requirement already satisfied: certifi>=2017.4.17 in /opt/conda/lib/python3.7/site-packages (from requests->transformers==4.2.1->-r requirements.txt (line 38)) (2021.5.30)
Requirement already satisfied: oauthlib>=3.0.0 in /opt/conda/lib/python3.7/site-packages (from requests-oauthlib>=0.7.0->google-auth-oauthlib<0.5,>=0.4.1->tensorboard>=2.2.0->pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (3.2.0)
Requirement already satisfied: azure-servicemanagement-legacy~=0.20.6 in /opt/conda/lib/python3.7/site-packages (from azure->-r requirements.txt (line 3)) (0.20.7)
Requirement already satisfied: azure-keyvault~=0.3.3 in /opt/conda/lib/python3.7/site-packages (from azure->-r requirements.txt (line 3)) (0.3.7)
Requirement already satisfied: azure-storage~=0.34.2 in /opt/conda/lib/python3.7/site-packages (from azure->-r requirements.txt (line 3)) (0.34.3)
Requirement already satisfied: azure-datalake-store~=0.0.9 in /opt/conda/lib/python3.7/site-packages (from azure->-r requirements.txt (line 3)) (0.0.52)
Requirement already satisfied: azure-mgmt~=1.0.0 in /opt/conda/lib/python3.7/site-packages (from azure->-r requirements.txt (line 3)) (1.0.0)
Requirement already satisfied: azure-graphrbac~=0.30.0 in /opt/conda/lib/python3.7/site-packages (from azure->-r requirements.txt (line 3)) (0.30.0)
Requirement already satisfied: azure-servicebus~=0.21.1 in /opt/conda/lib/python3.7/site-packages (from azure->-r requirements.txt (line 3)) (0.21.1)
Requirement already satisfied: azure-batch~=3.0.0 in /opt/conda/lib/python3.7/site-packages (from azure->-r requirements.txt (line 3)) (3.0.0)
Requirement already satisfied: azure-servicefabric~=5.6.130 in /opt/conda/lib/python3.7/site-packages (from azure->-r requirements.txt (line 3)) (5.6.130)
Requirement already satisfied: msrestazure~=0.4.7 in /opt/conda/lib/python3.7/site-packages (from azure-batch~=3.0.0->azure->-r requirements.txt (line 3)) (0.4.34)
Requirement already satisfied: azure-nspkg>=2.0.0 in /opt/conda/lib/python3.7/site-packages (from azure-batch~=3.0.0->azure->-r requirements.txt (line 3)) (3.0.2)
Requirement already satisfied: cffi in /opt/conda/lib/python3.7/site-packages (from azure-datalake-store~=0.0.9->azure->-r requirements.txt (line 3)) (1.14.5)
Requirement already satisfied: adal>=0.4.2 in /opt/conda/lib/python3.7/site-packages (from azure-datalake-store~=0.0.9->azure->-r requirements.txt (line 3)) (1.2.7)
Requirement already satisfied: PyJWT<3,>=1.0.0 in /opt/conda/lib/python3.7/site-packages (from adal>=0.4.2->azure-datalake-store~=0.0.9->azure->-r requirements.txt (line 3)) (2.3.0)
Requirement already satisfied: azure-mgmt-compute~=1.0.0 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (1.0.0)
Requirement already satisfied: azure-mgmt-cognitiveservices~=1.0.0 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (1.0.0)
Requirement already satisfied: azure-mgmt-scheduler~=1.1.2 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (1.1.3)
Requirement already satisfied: azure-mgmt-documentdb~=0.1.3 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (0.1.3)
Requirement already satisfied: azure-mgmt-keyvault~=0.31.0 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (0.31.0)
Requirement already satisfied: azure-mgmt-web~=0.32.0 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (0.32.0)
Requirement already satisfied: azure-mgmt-batch~=4.0.0 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (4.0.0)
Requirement already satisfied: azure-mgmt-sql~=0.5.1 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (0.5.3)
Requirement already satisfied: azure-mgmt-dns~=1.0.1 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (1.0.1)
Requirement already satisfied: azure-mgmt-resource~=1.1.0 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (1.1.0)
Requirement already satisfied: azure-mgmt-iothub~=0.2.2 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (0.2.2)
Requirement already satisfied: azure-mgmt-network~=1.0.0 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (1.0.0)
Requirement already satisfied: azure-mgmt-datalake-analytics~=0.1.4 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (0.1.6)
Requirement already satisfied: azure-mgmt-trafficmanager~=0.30.0 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (0.30.0)
Requirement already satisfied: azure-mgmt-authorization~=0.30.0 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (0.30.0)
Requirement already satisfied: azure-mgmt-cdn~=0.30.3 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (0.30.3)
Requirement already satisfied: azure-mgmt-datalake-store~=0.1.4 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (0.1.6)
Requirement already satisfied: azure-mgmt-storage~=1.0.0 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (1.0.0)
Requirement already satisfied: azure-mgmt-redis~=4.1.0 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (4.1.1)
Requirement already satisfied: azure-mgmt-devtestlabs~=2.0.0 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (2.0.0)
Requirement already satisfied: azure-mgmt-monitor~=0.2.1 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (0.2.1)
Requirement already satisfied: azure-mgmt-containerregistry~=0.2.1 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (0.2.1)
Requirement already satisfied: azure-mgmt-rdbms~=0.1.0 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (0.1.0)
Requirement already satisfied: azure-mgmt-logic~=2.1.0 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (2.1.0)
Requirement already satisfied: azure-mgmt-nspkg>=2.0.0 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt-authorization~=0.30.0->azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (3.0.2)
Requirement already satisfied: azure-mgmt-datalake-nspkg>=2.0.0 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt-datalake-analytics~=0.1.4->azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (3.0.1)
Requirement already satisfied: pycparser in /opt/conda/lib/python3.7/site-packages (from cffi->azure-datalake-store~=0.0.9->azure->-r requirements.txt (line 3)) (2.20)
Requirement already satisfied: msrest<2.0.0,>=0.4.28 in /opt/conda/lib/python3.7/site-packages (from msrestazure~=0.4.7->azure-batch~=3.0.0->azure->-r requirements.txt (line 3)) (0.6.21)
Requirement already satisfied: keyring>=12.0.2 in /opt/conda/lib/python3.7/site-packages (from msrestazure~=0.4.7->azure-batch~=3.0.0->azure->-r requirements.txt (line 3)) (23.5.0)
Requirement already satisfied: SecretStorage>=3.2 in /opt/conda/lib/python3.7/site-packages (from keyring>=12.0.2->msrestazure~=0.4.7->azure-batch~=3.0.0->azure->-r requirements.txt (line 3)) (3.3.1)
Requirement already satisfied: jeepney>=0.4.2 in /opt/conda/lib/python3.7/site-packages (from keyring>=12.0.2->msrestazure~=0.4.7->azure-batch~=3.0.0->azure->-r requirements.txt (line 3)) (0.7.1)
Requirement already satisfied: isodate>=0.6.0 in /opt/conda/lib/python3.7/site-packages (from msrest<2.0.0,>=0.4.28->msrestazure~=0.4.7->azure-batch~=3.0.0->azure->-r requirements.txt (line 3)) (0.6.1)
Requirement already satisfied: sqlparse>=0.2.2 in /opt/conda/lib/python3.7/site-packages (from django->-r requirements.txt (line 6)) (0.4.2)
Requirement already satisfied: asgiref<4,>=3.3.2 in /opt/conda/lib/python3.7/site-packages (from django->-r requirements.txt (line 6)) (3.5.0)
Requirement already satisfied: MarkupSafe>=2.0 in /opt/conda/lib/python3.7/site-packages (from jinja2->-r requirements.txt (line 11)) (2.1.1)
Requirement already satisfied: networkx>=2.2 in /opt/conda/lib/python3.7/site-packages (from scikit-image->-r requirements.txt (line 12)) (2.6.3)
Requirement already satisfied: imageio>=2.4.1 in /opt/conda/lib/python3.7/site-packages (from scikit-image->-r requirements.txt (line 12)) (2.9.0)
Requirement already satisfied: tifffile>=2019.7.26 in /opt/conda/lib/python3.7/site-packages (from scikit-image->-r requirements.txt (line 12)) (2021.11.2)
Requirement already satisfied: scipy>=1.4.1 in /opt/conda/lib/python3.7/site-packages (from scikit-image->-r requirements.txt (line 12)) (1.7.3)
Requirement already satisfied: PyWavelets>=1.1.1 in /opt/conda/lib/python3.7/site-packages (from scikit-image->-r requirements.txt (line 12)) (1.3.0)
Requirement already satisfied: click in /opt/conda/lib/python3.7/site-packages (from nltk->-r requirements.txt (line 14)) (8.0.4)
Requirement already satisfied: joblib in /opt/conda/lib/python3.7/site-packages (from nltk->-r requirements.txt (line 14)) (1.1.0)
Requirement already satisfied: ppft>=1.6.6.4 in /opt/conda/lib/python3.7/site-packages (from pathos->-r requirements.txt (line 17)) (1.6.6.4)
Requirement already satisfied: pox>=0.3.0 in /opt/conda/lib/python3.7/site-packages (from pathos->-r requirements.txt (line 17)) (0.3.0)
Requirement already satisfied: dill>=0.3.4 in /opt/conda/lib/python3.7/site-packages (from pathos->-r requirements.txt (line 17)) (0.3.4)
Requirement already satisfied: multiprocess>=0.70.12 in /opt/conda/lib/python3.7/site-packages (from pathos->-r requirements.txt (line 17)) (0.70.12.2)
Requirement already satisfied: linecache2 in /opt/conda/lib/python3.7/site-packages (from traceback2->-r requirements.txt (line 26)) (1.0.0)
Requirement already satisfied: scikit-learn in /opt/conda/lib/python3.7/site-packages (from sklearn->-r requirements.txt (line 29)) (1.0.2)
Requirement already satisfied: s3transfer<0.6.0,>=0.5.0 in /opt/conda/lib/python3.7/site-packages (from boto3->-r requirements.txt (line 31)) (0.5.2)
Requirement already satisfied: botocore<1.25.0,>=1.24.20 in /opt/conda/lib/python3.7/site-packages (from boto3->-r requirements.txt (line 31)) (1.24.20)
Requirement already satisfied: jmespath<1.0.0,>=0.7.1 in /opt/conda/lib/python3.7/site-packages (from boto3->-r requirements.txt (line 31)) (0.10.0)
Requirement already satisfied: tensorboardX in /opt/conda/lib/python3.7/site-packages (from pytorch_lamb->-r requirements.txt (line 34)) (2.5)
Requirement already satisfied: llvmlite<0.39,>=0.38.0rc1 in /opt/conda/lib/python3.7/site-packages (from numba->-r requirements.txt (line 50)) (0.38.0)
Requirement already satisfied: charset-normalizer<3.0,>=2.0 in /opt/conda/lib/python3.7/site-packages (from aiohttp->fsspec[http]>=0.8.1->pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (2.0.12)
Requirement already satisfied: frozenlist>=1.1.1 in /opt/conda/lib/python3.7/site-packages (from aiohttp->fsspec[http]>=0.8.1->pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (1.3.0)
Requirement already satisfied: yarl<2.0,>=1.0 in /opt/conda/lib/python3.7/site-packages (from aiohttp->fsspec[http]>=0.8.1->pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (1.7.2)
Requirement already satisfied: asynctest==0.13.0 in /opt/conda/lib/python3.7/site-packages (from aiohttp->fsspec[http]>=0.8.1->pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (0.13.0)
Requirement already satisfied: aiosignal>=1.1.2 in /opt/conda/lib/python3.7/site-packages (from aiohttp->fsspec[http]>=0.8.1->pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (1.2.0)
Requirement already satisfied: async-timeout<5.0,>=4.0.0a3 in /opt/conda/lib/python3.7/site-packages (from aiohttp->fsspec[http]>=0.8.1->pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (4.0.2)
Requirement already satisfied: multidict<7.0,>=4.5 in /opt/conda/lib/python3.7/site-packages (from aiohttp->fsspec[http]>=0.8.1->pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (6.0.2)
Requirement already satisfied: attrs>=17.3.0 in /opt/conda/lib/python3.7/site-packages (from aiohttp->fsspec[http]>=0.8.1->pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (21.4.0)
Requirement already satisfied: gitdb<5,>=4.0.1 in /opt/conda/lib/python3.7/site-packages (from GitPython->sacred==0.8.2->-r requirements.txt (line 47)) (4.0.9)
Requirement already satisfied: smmap<6,>=3.0.1 in /opt/conda/lib/python3.7/site-packages (from gitdb<5,>=4.0.1->GitPython->sacred==0.8.2->-r requirements.txt (line 47)) (5.0.0)
Requirement already satisfied: threadpoolctl>=2.0.0 in /opt/conda/lib/python3.7/site-packages (from scikit-learn->sklearn->-r requirements.txt (line 29)) (3.1.0)
WARNING: Running pip as root will break packages and permissions. You should install packages reliably by using venv: https://pip.pypa.io/warnings/venv
+ pip install deprecated
Requirement already satisfied: deprecated in /opt/conda/lib/python3.7/site-packages (1.2.13)
Requirement already satisfied: wrapt<2,>=1.10 in /opt/conda/lib/python3.7/site-packages (from deprecated) (1.14.0)
WARNING: Running pip as root will break packages and permissions. You should install packages reliably by using venv: https://pip.pypa.io/warnings/venv
+ export NCCL_DEBUG=INFO
+ python -c import nltk; nltk.download("punkt")
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
+ python -c import nltk; nltk.download("averaged_perceptron_tagger")
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /root/nltk_data...
[nltk_data]   Package averaged_perceptron_tagger is already up-to-
[nltk_data]       date!
+ sleep 5
03-16 04:42:17.890 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:49 cmd_run(): start to cmd run: ls -llh
03-16 04:42:17.890 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:51 cmd_run(): ls
03-16 04:42:17.890 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:51 cmd_run(): -llh
total 4.3M
-rw-rw-r-- 1 root root 4.5K Feb  4  2021 aml_job_config.json
drwxr-xr-x 6 root root 4.0K Feb  5  2021 aux_data
-rw-rw-r-- 1 root root  24K Jul 26  2021 CLIPS.ipynb
-rw-rw-r-- 1 root root   20 Jun  9  2021 README.md
-rw-rw-r-- 1 root root  627 Mar 11 17:55 requirements.txt
drwxr-xr-x 2 root root 4.0K Apr 12  2021 scripts
drwxrwxr-x 5 root root 4.0K Sep 23 00:25 src
-rw-rw-r-- 1 root root  31K Nov 10 00:06 stats.pdf
-rw-rw-r-- 1 root root  17K Feb  1 23:51 T5_test.ipynb
drwxrwxr-x 4 root root 4.0K Dec 13  2020 tools
-rw-rw-r-- 1 root root 3.3M Nov 10 00:06 Untitled.ipynb
-rw-rw-r-- 1 root root  69K Nov  9 23:02 vinvl_label.json
-rw-rw-r-- 1 root root 577K Nov 16 22:18 Visualization.ipynb
-rw-rw-r-- 1 root root 3.4K Sep  2  2021 visualize.py
03-16 04:42:17.895 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:49 cmd_run(): start to cmd run: pip freeze
03-16 04:42:17.895 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:51 cmd_run(): pip
03-16 04:42:17.895 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:51 cmd_run(): freeze
absl-py==1.0.0
adal==1.2.7
aiohttp==3.8.1
aiosignal==1.2.0
anytree==2.8.0
apex==0.1
asgiref==3.5.0
async-timeout==4.0.2
asynctest==0.13.0
attrs==21.4.0
azure==2.0.0
azure-batch==3.0.0
azure-common==1.1.28
azure-datalake-store==0.0.52
azure-graphrbac==0.30.0
azure-keyvault==0.3.7
azure-mgmt==1.0.0
azure-mgmt-authorization==0.30.0
azure-mgmt-batch==4.0.0
azure-mgmt-cdn==0.30.3
azure-mgmt-cognitiveservices==1.0.0
azure-mgmt-compute==1.0.0
azure-mgmt-containerregistry==0.2.1
azure-mgmt-datalake-analytics==0.1.6
azure-mgmt-datalake-nspkg==3.0.1
azure-mgmt-datalake-store==0.1.6
azure-mgmt-devtestlabs==2.0.0
azure-mgmt-dns==1.0.1
azure-mgmt-documentdb==0.1.3
azure-mgmt-iothub==0.2.2
azure-mgmt-keyvault==0.31.0
azure-mgmt-logic==2.1.0
azure-mgmt-monitor==0.2.1
azure-mgmt-network==1.0.0
azure-mgmt-nspkg==3.0.2
azure-mgmt-rdbms==0.1.0
azure-mgmt-redis==4.1.1
azure-mgmt-resource==1.1.0
azure-mgmt-scheduler==1.1.3
azure-mgmt-sql==0.5.3
azure-mgmt-storage==1.0.0
azure-mgmt-trafficmanager==0.30.0
azure-mgmt-web==0.32.0
azure-nspkg==3.0.2
azure-servicebus==0.21.1
azure-servicefabric==5.6.130
azure-servicemanagement-legacy==0.20.7
azure-storage==0.34.3
azure-storage-blob==2.1.0
azure-storage-common==2.1.0
backcall==0.2.0
boto3==1.21.20
botocore==1.24.20
brotlipy==0.7.0
cachetools==5.0.0
certifi==2021.5.30
cffi @ file:///tmp/build/80754af9/cffi_1613246939562/work
chardet @ file:///tmp/build/80754af9/chardet_1605303159953/work
charset-normalizer==2.0.12
click==8.0.4
colorama==0.4.4
conda==4.10.1
conda-package-handling @ file:///tmp/build/80754af9/conda-package-handling_1618262151086/work
cryptography @ file:///tmp/build/80754af9/cryptography_1616769182610/work
cycler==0.10.0
Cython==0.29.23
dataclasses==0.6
decorator==5.1.1
Deprecated==1.2.13
dill==0.3.4
Django==3.2.12
docopt==0.6.2
easydict==1.9
einops==0.3.0
ete3==3.1.2
fairscale==0.4.2
filelock==3.6.0
frozenlist==1.3.0
fsspec==2022.2.0
future==0.18.2
gitdb==4.0.9
GitPython==3.1.27
google-auth==2.6.0
google-auth-oauthlib==0.4.6
grpcio==1.44.0
idna @ file:///tmp/build/80754af9/idna_1593446292537/work
imageio==2.9.0
importlib-metadata==4.11.3
ipdb==0.13.4
ipython==7.32.0
isodate==0.6.1
jedi==0.18.1
jeepney==0.7.1
Jinja2==3.0.3
jmespath==0.10.0
joblib==1.1.0
jsonpickle==1.5.2
keyring==23.5.0
kiwisolver==1.3.1
kmeans-pytorch==0.3
linecache2==1.0.0
llvmlite==0.38.0
Markdown==3.3.6
MarkupSafe==2.1.1
matplotlib==3.4.2
matplotlib-inline==0.1.3
mkl-fft==1.3.0
mkl-random @ file:///tmp/build/80754af9/mkl_random_1618853974840/work
mkl-service==2.3.0
msrest==0.6.21
msrestazure==0.4.34
multidict==6.0.2
multiprocess==0.70.12.2
munch==2.5.0
networkx==2.6.3
ninja==1.10.0.post2
nltk==3.7
numba==0.55.1
numpy==1.20.0
oauthlib==3.2.0
olefile==0.46
opencv-python==4.5.5.64
orderedset==2.0.3
packaging==21.3
pandas==1.1.5
parso==0.8.3
pathos==0.2.8
pexpect==4.8.0
pickleshare==0.7.5
Pillow==8.1.0
pox==0.3.0
ppft==1.6.6.4
progressbar==2.5
prompt-toolkit==3.0.28
protobuf==3.19.4
psutil==5.9.0
ptyprocess==0.7.0
py-cpuinfo==8.0.0
pyarrow==2.0.0
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycocotools==2.0
pycosat==0.6.3
pycparser @ file:///tmp/build/80754af9/pycparser_1594388511720/work
Pygments==2.11.2
PyJWT==2.3.0
pymongo==4.0.2
pyOpenSSL @ file:///tmp/build/80754af9/pyopenssl_1605545627475/work
pyparsing==3.0.0b2
PySocks @ file:///tmp/build/80754af9/pysocks_1594394576006/work
python-dateutil==2.8.1
python-magic==0.4.25
pytorch-lamb==1.0.0
pytorch-lightning==1.1.4
pytz==2021.3
PyWavelets==1.3.0
PyYAML==5.4.1
regex==2022.3.15
requests @ file:///tmp/build/80754af9/requests_1592841827918/work
requests-oauthlib==1.3.1
rsa==4.8
ruamel-yaml-conda @ file:///tmp/build/80754af9/ruamel_yaml_1616016701961/work
s3transfer==0.5.2
sacred==0.8.2
sacremoses==0.0.49
scikit-image==0.19.2
scikit-learn==1.0.2
scipy==1.7.3
SecretStorage==3.3.1
simplejson==3.17.6
six @ file:///tmp/build/80754af9/six_1605205313296/work
sklearn==0.0
smmap==5.0.0
sqlparse==0.4.2
tb-nightly==2.9.0a20220313
tensorboard==2.8.0
tensorboard-data-server==0.6.1
tensorboard-plugin-wit==1.8.1
tensorboardX==2.5
threadpoolctl==3.1.0
tifffile==2021.11.2
timm @ git+https://github.com/rwightman/pytorch-image-models.git@7c67d6aca992f039eece0af5f7c29a43d48c00e4
tokenizers==0.9.4
torch==1.7.0
torchlars==0.1.2
torchvision==0.8.0
tqdm==4.56.0
traceback2==1.4.0
traitlets==5.1.1
transformers==4.2.1
typing-extensions @ file:///home/ktietz/src/ci_mi/typing_extensions_1612808209620/work
urllib3 @ file:///tmp/build/80754af9/urllib3_1603305693037/work
wcwidth==0.2.5
Werkzeug==2.0.3
wrapt==1.14.0
yacs==0.1.8
yarl==1.7.2
zipp==3.7.0
03-16 04:42:18.312 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:229 wrap_all(): python src/qd/pipeline.py -c ./aux_data/Jacob_config/new_result/coco-caption_no_vlp/classifier_Embedding/VinVL_Label_60_epoch_0.1_lrreduc/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1.yaml
03-16 04:42:18.320 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:49 cmd_run(): start to cmd run: python src/qd/pipeline.py -c ./aux_data/Jacob_config/new_result/coco-caption_no_vlp/classifier_Embedding/VinVL_Label_60_epoch_0.1_lrreduc/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1.yaml
03-16 04:42:18.320 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:51 cmd_run(): python
03-16 04:42:18.321 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:51 cmd_run(): src/qd/pipeline.py
03-16 04:42:18.321 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:51 cmd_run(): -c
03-16 04:42:18.321 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:51 cmd_run(): ./aux_data/Jacob_config/new_result/coco-caption_no_vlp/classifier_Embedding/VinVL_Label_60_epoch_0.1_lrreduc/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1.yaml
03-16 04:42:18.321 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi
03-16 04:42:18.321 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi
03-16 04:42:19.000 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 0, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 0, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 0, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 0, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 0, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 0, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 0, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 0, 'mem_total': 32510, 'gpu_util': 0}]
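(The monitor() helper in aml_server.py is not shown in this log, but the dict shape it prints can be reproduced by querying nvidia-smi directly. A minimal, hypothetical sketch; the real implementation may differ.)

import subprocess

def gpu_stats():
    # Query per-GPU memory and utilization; `nounits` drops "MiB"/"%" so
    # each CSV line looks like "0, 32510, 0".
    out = subprocess.check_output([
        'nvidia-smi',
        '--query-gpu=memory.used,memory.total,utilization.gpu',
        '--format=csv,noheader,nounits',
    ]).decode()
    stats = []
    for line in out.strip().splitlines():
        used, total, util = (int(x) for x in line.split(', '))
        stats.append({'mem_used': used, 'mem_total': total, 'gpu_util': util})
    return stats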
2022-03-16 04:42:20,067.067 2829:qd_common.py:1742 setup_yaml(): python 3 env
2022-03-16 04:42:20,068.068 2829:qd_common.py:1105 parse_general_args(): loading parameter from ./aux_data/Jacob_config/new_result/coco-caption_no_vlp/classifier_Embedding/VinVL_Label_60_epoch_0.1_lrreduc/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1.yaml
2022-03-16 04:42:20,074.074 2829:pipeline.py:1365 <module>(): param: {'all_test_data': [{'test_data': 'TaxCocoCaption', 'test_split': 'test'}], 'param': {'add_od_labels': True, 'base_lr': 0.0001, 'basemodel': './output/Jacob_Tagger_TaxCCSBUCocoVGCap_B_Vilt_ViT_16_384_20_epoch_lr_5e-5_BS_1024_loss_focal_crop_0.08_bert_category/snapshot/model_iter_0081989.pt', 'category': 'bert', 'crop_pct': 1.0, 'data': 'TaxCocoCaption', 'drop_out': 0, 'effective_batch_size': 512, 'expid': 'Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb', 'expid_prefix': 'CAPU', 'force_predict': True, 'force_train': True, 'full_expid': 'Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb', 'ignore_project_image': True, 'image_encoder_pretrained': True, 'image_encoder_type': 'VitEmb_vit_base_patch16_384', 'img_feature_dim': 2054, 'input_small_scale': 0.08, 'log_step': 100, 'loss': 'focal', 'lr_multiplier': 0.1, 'mask_type': 'seq2seq', 'max_img_seq_length': 0, 'max_iter': '60e', 'max_seq_a_length': 20, 'max_seq_length': 70, 'monitor_after': True, 'multi_crop': False, 'multi_crop_scale': False, 'multi_scale': False, 'net': 'B', 'od_label_conf': 0.2, 'pad_to_max': True, 'pipeline_type': {'from': 'src.qd.pipelines.tagger_caption_uni_pipeline_expanding', 'import': 'CaptionUniPipeline'}, 'split_blocks': 4, 'tagemb': 'cls', 'test_batch_size': 16, 'test_crop_size': 384, 'text_encoder_type': './aux_data/untrained_config/VILT-L12-H784-uncased_16_384', 'tokenizer_file': 'vinvl_label.json', 'topk': 50, 'train_crop_size': 384, 'train_label_version': 'vinvl', 'train_transform': 'vit', 'use_amp': False, 'use_img_layernorm': False, 'weight_decay': 0.05}, 'type': 'pipeline_train_eval_multi'}
2022-03-16 04:42:20,235.235 2829:qd_common.py:3452 print_frame_info(): func name = pipeline_train_eval_multi; all_test_data = [{'test_data': 'TaxCocoCaption', 'test_split': 'test'}]; param = {'data': 'TaxCocoCaption', 'drop_out': 0, 'net': 'B', 'mask_type': 'seq2seq', 'tokenizer_file': 'vinvl_label.json', 'topk': 50, 'basemodel': './output/Jacob_Tagger_TaxCCSBUCocoVGCap_B_Vilt_ViT_16_384_20_epoch_lr_5e-5_BS_1024_loss_focal_crop_0.08_bert_category/snapshot/model_iter_0081989.pt', 'text_encoder_type': './aux_data/untrained_config/VILT-L12-H784-uncased_16_384', 'image_encoder_type': 'VitEmb_vit_base_patch16_384', 'crop_pct': 1.0, 'base_lr': 0.0001, 'split_blocks': 4, 'lr_multiplier': 0.1, 'monitor_after': True, 'test_crop_size': 384, 'train_crop_size': 384, 'multi_scale': False, 'multi_crop': False, 'multi_crop_scale': False, 'train_transform': 'vit', 'use_img_layernorm': False, 'expid': 'Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb', 'full_expid': 'Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb', 'image_encoder_pretrained': True, 'expid_prefix': 'CAPU', 'pad_to_max': True, 'add_od_labels': True, 'effective_batch_size': 512, 'test_batch_size': 16, 'max_iter': '60e', 'ignore_project_image': True, 'input_small_scale': 0.08, 'log_step': 100, 'weight_decay': 0.05, 'use_amp': False, 'tagemb': 'cls', 'max_img_seq_length': 0, 'od_label_conf': 0.2, 'max_seq_length': 70, 'max_seq_a_length': 20, 'img_feature_dim': 2054, 'train_label_version': 'vinvl', 'loss': 'focal', 'category': 'bert', 'force_train': True, 'force_predict': True, 'pipeline_type': {'from': 'src.qd.pipelines.tagger_caption_uni_pipeline_expanding', 'import': 'CaptionUniPipeline'}}
2022-03-16 04:42:20,235.235 2829:qd_common.py:1742 setup_yaml(): python 3 env
2022-03-16 04:42:26,690.690 2829:torch_common.py:408 ensure_init_process_group(): {'backend': 'nccl', 'init_method': 'tcp://10.0.0.8:12345', 'rank': 0, 'world_size': 32, 'timeout': datetime.timedelta(days=10)}
42c07f8197104c3b988e50758ff54da200000C:2829:2829 [0] NCCL INFO Bootstrap : Using [0]eth0:10.0.0.8<0>
42c07f8197104c3b988e50758ff54da200000C:2829:2829 [0] NCCL INFO NET/Plugin : No plugin found (libnccl-net.so), using internal implementation
42c07f8197104c3b988e50758ff54da200000C:2829:2829 [0] NCCL INFO NCCL_IB_DISABLE set by environment to 0.
libibverbs: Warning: couldn't load driver 'libvmw_pvrdma-rdmav25.so': libvmw_pvrdma-rdmav25.so: cannot open shared object file: No such file or directory
libibverbs: Warning: couldn't load driver 'libmthca-rdmav25.so': libmthca-rdmav25.so: cannot open shared object file: No such file or directory
libibverbs: Warning: couldn't load driver 'libhfi1verbs-rdmav25.so': libhfi1verbs-rdmav25.so: cannot open shared object file: No such file or directory
libibverbs: Warning: couldn't load driver 'libi40iw-rdmav25.so': libi40iw-rdmav25.so: cannot open shared object file: No such file or directory
libibverbs: Warning: couldn't load driver 'libqedr-rdmav25.so': libqedr-rdmav25.so: cannot open shared object file: No such file or directory
libibverbs: Warning: couldn't load driver 'libcxgb4-rdmav25.so': libcxgb4-rdmav25.so: cannot open shared object file: No such file or directory
libibverbs: Warning: couldn't load driver 'libocrdma-rdmav25.so': libocrdma-rdmav25.so: cannot open shared object file: No such file or directory
libibverbs: Warning: couldn't load driver 'libipathverbs-rdmav25.so': libipathverbs-rdmav25.so: cannot open shared object file: No such file or directory
libibverbs: Warning: couldn't load driver 'libhns-rdmav25.so': libhns-rdmav25.so: cannot open shared object file: No such file or directory
libibverbs: Warning: couldn't load driver 'libbnxt_re-rdmav25.so': libbnxt_re-rdmav25.so: cannot open shared object file: No such file or directory
42c07f8197104c3b988e50758ff54da200000C:2829:2829 [0] NCCL INFO NET/IB : Using [0]mlx5_ib0:1/IB ; OOB eth0:10.0.0.8<0>
42c07f8197104c3b988e50758ff54da200000C:2829:2829 [0] NCCL INFO Using network IB
NCCL version 2.7.8+cuda10.2
42c07f8197104c3b988e50758ff54da200000C:2829:3282 [0] NCCL INFO NCCL_IB_TIMEOUT set by environment to 32.
42c07f8197104c3b988e50758ff54da200000C:2829:3282 [0] NCCL INFO Channel 00/02 : 0 1 2 4 7 6 5 3 8 9 10 12 15 14 13 11 16 17 18 20
42c07f8197104c3b988e50758ff54da200000C:2829:3282 [0] NCCL INFO Channel 01/02 : 0 1 2 4 7 6 5 3 8 9 10 12 15 14 13 11 16 17 18 20
42c07f8197104c3b988e50758ff54da200000C:2829:3282 [0] NCCL INFO threadThresholds 8/8/64 | 256/8/64 | 8/8/64
42c07f8197104c3b988e50758ff54da200000C:2829:3282 [0] NCCL INFO Trees [0] 1/-1/-1->0->-1|-1->0->1/-1/-1 [1] 1/-1/-1->0->25|25->0->1/-1/-1
42c07f8197104c3b988e50758ff54da200000C:2829:3282 [0] NCCL INFO Setting affinity for GPU 0 to 0fffff
42c07f8197104c3b988e50758ff54da200000C:2829:3282 [0] NCCL INFO Channel 00 : 27[400000] -> 0[100000] [receive] via NET/IB/0
42c07f8197104c3b988e50758ff54da200000C:2829:3282 [0] NCCL INFO Channel 00 : 0[100000] -> 1[200000] via P2P/IPC
42c07f8197104c3b988e50758ff54da200000C:2829:3282 [0] NCCL INFO Channel 01 : 27[400000] -> 0[100000] [receive] via NET/IB/0
42c07f8197104c3b988e50758ff54da200000C:2829:3282 [0] NCCL INFO Channel 01 : 0[100000] -> 1[200000] via P2P/IPC
42c07f8197104c3b988e50758ff54da200000C:2829:3282 [0] NCCL INFO Channel 01 : 0[100000] -> 25[200000] [send] via NET/IB/0
42c07f8197104c3b988e50758ff54da200000C:2829:3282 [0] NCCL INFO Channel 01 : 25[200000] -> 0[100000] [receive] via NET/IB/0
42c07f8197104c3b988e50758ff54da200000C:2829:3282 [0] NCCL INFO 2 coll channels, 2 p2p channels, 1 p2p channels per peer
42c07f8197104c3b988e50758ff54da200000C:2829:3282 [0] NCCL INFO comm 0x7fcb40001060 rank 0 nranks 32 cudaDev 0 busId 100000 - Init COMPLETE
42c07f8197104c3b988e50758ff54da200000C:2829:2829 [0] NCCL INFO Launch mode Parallel
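(For reference, the ensure_init_process_group() call logged above boils down to a single torch.distributed call; this is a minimal sketch using only values taken from the log, since the helper itself in src/qd/torch_common.py is not shown here.)

import datetime
import torch.distributed as dist

# Values from the ensure_init_process_group() log line: rank-0's eth0
# address hosts the TCP rendezvous; world_size 32 spans 8 V100s per node.
# In the real job each process passes its own rank instead of 0.
dist.init_process_group(
    backend='nccl',
    init_method='tcp://10.0.0.8:12345',
    rank=0,
    world_size=32,
    timeout=datetime.timedelta(days=10),
)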
'input_small_scale': 0.08, 'label_smoothing': 0.1, 'ln_no_weight_decay': True, 'log_step': 100, 'loss': 'focal', 'lr_multiplier': 0.1, 'mask_prob': 0.15, 'mask_type': 'seq2seq', 'max_gen_length': 20, 'max_img_seq_length': 0, 'max_iter': '60e', 'max_masked_tokens': 3, 'max_seq_a_length': 20, 'max_seq_length': 70, 'min_rel_lr_in_cosine': 0.0, 'mobilenetv3_dropout_ratio': 0.2, 'momentum': 0.9, 'monitor_after': True, 'multi_crop': False, 'multi_crop_scale': False, 'multi_scale': False, 'net': 'B', 'no_sort_by_conf': False, 'num_beams': 1, 'num_workers': 8, 'od_label_conf': 0.2, 'od_label_conf ': 0.2, 'optimizer_type': 'MAdamW', 'output_isvalid': False, 'ovthresh': [-1], 'pad_to_max': True, 'pert_img_prob': None, 'pipeline_type': {'from': 'src.qd.pipelines.tagger_caption_uni_pipeline_expanding', 'import': 'CaptionUniPipeline'}, 'pred_tsv_to_json_extra': 1, 'random_seed': 88, 'real_text_a_in_test': False, 'replace_by_mask_prob': 0.8, 'replace_by_rand_prob': 0.1, 'rms_alpha': 0.99, 'scheduler_type': 'linear', 'smooth_label_eps': 0.1, 'snapshot_steps': 5000, 'split_blocks': 4, 'splitbysplitsample_buffer_size': 1, 'splitbysplitsample_group_size': 1, 'step_lr': 30, 'tagemb': 'cls', 'temperature': 1, 'test_batch_size': 16, 'test_crop_size': 384, 'test_data': 'TaxCocoCaption', 'test_mergebn': False, 'test_split': 'test', 'text_encoder_type': './aux_data/untrained_config/VILT-L12-H784-uncased_16_384', 'tie_weights': True, 'tokenizer_file': 'vinvl_label.json', 'top_k': 0, 'top_p': 1, 'topk': 50, 'train_crop_size': 384, 'train_label_version': 'vinvl', 'train_shuffle': True, 'train_transform': 'vit', 'unique_labels_on': False, 'use_amp': False, 'use_img_layernorm': False, 'warmup_steps': 0, 'weight_decay': 0.05}
2022-03-16 04:43:35,522.522 2829:uni_pipeline.py:545 ensure_train(): torch info = {'cuda': '10.2', 'cudnn': 7605, 'current_device': 0, 'device_count': 8, 'nccl': 2708, 'version': '1.7.0'}
2022-03-16 04:43:35,621.621 2829:modeling_utils.py:187 from_pretrained(): loading configuration file ./aux_data/untrained_config/VILT-L12-H784-uncased_16_384/config.json
2022-03-16 04:43:35,621.621 2829:modeling_utils.py:211 from_pretrained(): Model config {
  "attention_probs_dropout_prob": 0.1,
  "finetuning_task": "image_captioning",
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "TIMM_vit",
  "net": "vit_base_patch16_384",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "num_labels": 2,
  "output_attentions": false,
  "output_hidden_states": false,
  "pretrained": true,
  "torchscript": false,
  "type_vocab_size": 2,
  "vocab_size": 30522
}
2022-03-16 04:43:35,624.624 2829:tokenization_utils.py:170 _from_pretrained(): Model name './aux_data/untrained_config/VILT-L12-H784-uncased_16_384' not found in model shortcut name list (bert-base-uncased, bert-large-uncased, bert-base-cased, bert-large-cased, bert-base-multilingual-uncased, bert-base-multilingual-cased, bert-base-chinese, bert-base-german-cased, bert-large-uncased-whole-word-masking, bert-large-cased-whole-word-masking, bert-large-uncased-whole-word-masking-finetuned-squad, bert-large-cased-whole-word-masking-finetuned-squad, bert-base-cased-finetuned-mrpc). Assuming './aux_data/untrained_config/VILT-L12-H784-uncased_16_384' is a path or url to a directory containing tokenizer files.
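Because the path is not a known shortcut name, the loader treats it as a local tokenizer directory; only vocab.txt is present there, and the optional added_tokens.json and special_tokens_map.json are skipped, as the next entries show. A minimal sketch of the equivalent load with the public Hugging Face transformers API (illustrative; this codebase uses its own vendored copy under src/qd/mask/layers/bert):

```python
from transformers import BertTokenizer

# Only vocab.txt is required in the directory; added_tokens.json and
# special_tokens_map.json are optional and are skipped when absent.
tokenizer = BertTokenizer.from_pretrained(
    "./aux_data/untrained_config/VILT-L12-H784-uncased_16_384"
)
print(tokenizer.tokenize("a dog sitting on a bench"))
```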
2022-03-16 04:43:35,624.624 2829:tokenization_utils.py:180 _from_pretrained(): Didn't find file ./aux_data/untrained_config/VILT-L12-H784-uncased_16_384/added_tokens.json. We won't load it.
2022-03-16 04:43:35,624.624 2829:tokenization_utils.py:180 _from_pretrained(): Didn't find file ./aux_data/untrained_config/VILT-L12-H784-uncased_16_384/special_tokens_map.json. We won't load it.
2022-03-16 04:43:35,624.624 2829:tokenization_utils.py:214 _from_pretrained(): loading file None
2022-03-16 04:43:35,624.624 2829:tokenization_utils.py:214 _from_pretrained(): loading file None
2022-03-16 04:43:35,624.624 2829:tokenization_utils.py:214 _from_pretrained(): loading file ./aux_data/untrained_config/VILT-L12-H784-uncased_16_384/vocab.txt
--- Logging error ---
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 1025, in emit
    msg = self.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 869, in format
    return fmt.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 608, in format
    record.message = record.getMessage()
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 367, in getMessage
    msg = str(self.msg)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 236, in __repr__
    return str(self.to_json_string())
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 245, in to_json_string
    return json.dumps(self.to_dict(), indent=2, sort_keys=True) + "\n"
  File "/opt/conda/lib/python3.7/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 201, in encode
    chunks = list(chunks)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 431, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "/opt/conda/lib/python3.7/json/encoder.py", line 438, in _iterencode
    o = _default(o)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type BertTokenizer is not JSON serializable
Call stack:
  File "src/qd/pipeline.py", line 1368, in <module>
    locals()[function_name](**kwargs)
  File "src/qd/pipeline.py", line 620, in pipeline_train_eval_multi
    pip.ensure_train()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 550, in ensure_train
    train_result = self.train()
  File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 605, in train
    model = self.get_model(is_train=True)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model
    model = self.get_raw_model(is_train)
  File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model
    model = TaggerEncDecSplitForImageCaptioning(config=config)  # init from scratch
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__
    self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__
    self.encoder = TIMMVitSplitEncoder(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__
    logging.info(config)
Unable to print the message and arguments - possible formatting error.
Use the traceback above to help find the error.
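This traceback is noise rather than a training failure: logging.info(config) triggers the config object's __repr__, which json-dumps the config's attribute dict, and that dict holds a live BertTokenizer, which the JSON encoder cannot serialize. A minimal sketch of a tolerant serializer (a hypothetical patch to the vendored modeling_utils.py, assuming its to_dict() returns the attribute dict as the traceback implies):

```python
import json

def to_json_string(self):
    """Render the config as JSON; fall back to repr() for values the JSON
    encoder cannot handle (e.g. the BertTokenizer stored on the config)."""
    return json.dumps(self.to_dict(), indent=2, sort_keys=True, default=repr) + "\n"
```

With default=repr the log call succeeds and the tokenizer simply appears as its repr string instead of aborting the whole message.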
2022-03-16 04:43:36,373.373 2829:modeling_bert.py:529 __init__(): TIMM Split image encoder load from pre-trained: True
2022-03-16 04:43:37,526.526 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-16 04:43:42,387.387 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-16 04:43:43,023.023 2829:modeling_bert.py:2677 __init__(): BertImgModel Image Dimension: 2054
2022-03-16 04:43:43,176.176 2829:modeling_bert.py:2688 __init__(): BertImgModel Image Dimension: 2054
2022-03-16 04:43:43,972.972 2829:tagger_caption_uni_pipeline_expanding.py:1130 get_image_encoder_model(): VIT image encoder loaded from pre-trained weight!
Note that this might be replaced by pre-trained checkpoint later! 2022-03-16 04:43:43,972.972 2829:tagger_caption_uni_pipeline_expanding.py:1134 get_image_encoder_model(): Non-Patch Selection Mode. 2022-03-16 04:43:45,257.257 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py 2022-03-16 04:43:46,254.254 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): module.cls_token: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,255.255 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): module.pos_embed: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,255.255 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): module.patch_embed.proj.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,255.255 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): module.patch_embed.proj.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,255.255 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): module.head.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,255.255 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): module.head.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,255.255 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): word_embeddings.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,256.256 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): position_embeddings.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,256.256 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): token_type_embeddings.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,256.256 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): LayerNorm.weight: lr = 0.0001; weight_decay = 0 2022-03-16 04:43:46,256.256 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): LayerNorm.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,256.256 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.norm1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,256.256 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.norm1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,256.256 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.attn.qkv.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,257.257 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.attn.qkv.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,257.257 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.attn.proj.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,257.257 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.attn.proj.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,257.257 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.norm2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,257.257 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.norm2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,257.257 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.mlp.fc1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,257.257 
2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.mlp.fc1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,257.257 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.mlp.fc2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,257.257 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.mlp.fc2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,257.257 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.norm1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,257.257 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.norm1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,258.258 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.attn.qkv.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,258.258 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.attn.qkv.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,258.258 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.attn.proj.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,258.258 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.attn.proj.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,258.258 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.norm2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,258.258 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.norm2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,258.258 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.mlp.fc1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,258.258 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.mlp.fc1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,258.258 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.mlp.fc2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,259.259 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.mlp.fc2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,259.259 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.norm1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,259.259 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.norm1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,259.259 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.attn.qkv.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,259.259 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.attn.qkv.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,259.259 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.attn.proj.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,259.259 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.attn.proj.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,259.259 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.norm2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,259.259 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.norm2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,259.259 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 
2.mlp.fc1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,259.259 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.mlp.fc1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,260.260 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.mlp.fc2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,260.260 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.mlp.fc2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,260.260 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.norm1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,260.260 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.norm1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,260.260 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.attn.qkv.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,260.260 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.attn.qkv.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,260.260 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.attn.proj.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,260.260 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.attn.proj.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,260.260 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.norm2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,260.260 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.norm2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,261.261 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.mlp.fc1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,261.261 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.mlp.fc1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,261.261 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.mlp.fc2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,261.261 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.mlp.fc2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,261.261 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 4.norm1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,261.261 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 4.norm1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,261.261 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 4.attn.qkv.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,261.261 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 4.attn.qkv.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,261.261 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 4.attn.proj.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,261.261 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 4.attn.proj.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,262.262 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 4.norm2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,262.262 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 4.norm2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 
04:43:46,262.262 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 4.mlp.fc1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,262.262 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 4.mlp.fc1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,262.262 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 4.mlp.fc2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,262.262 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 4.mlp.fc2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,262.262 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 5.norm1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,262.262 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 5.norm1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,262.262 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 5.attn.qkv.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,262.262 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 5.attn.qkv.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,262.262 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 5.attn.proj.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,263.263 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 5.attn.proj.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,263.263 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 5.norm2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,263.263 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 5.norm2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,263.263 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 5.mlp.fc1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,263.263 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 5.mlp.fc1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,263.263 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 5.mlp.fc2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,263.263 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 5.mlp.fc2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,263.263 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 6.norm1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,263.263 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 6.norm1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,263.263 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 6.attn.qkv.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,264.264 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 6.attn.qkv.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,264.264 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 6.attn.proj.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,264.264 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 6.attn.proj.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,264.264 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 6.norm2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,264.264 2829:tagger_caption_uni_pipeline_expanding.py:655 
get_parameter_groups(): 6.norm2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,264.264 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 6.mlp.fc1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,264.264 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 6.mlp.fc1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,264.264 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 6.mlp.fc2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,264.264 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 6.mlp.fc2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,264.264 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 7.norm1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,265.265 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 7.norm1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,265.265 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 7.attn.qkv.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,265.265 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 7.attn.qkv.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,265.265 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 7.attn.proj.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,265.265 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 7.attn.proj.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,265.265 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 7.norm2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,265.265 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 7.norm2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,265.265 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 7.mlp.fc1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,265.265 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 7.mlp.fc1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,265.265 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 7.mlp.fc2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,266.266 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 7.mlp.fc2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,266.266 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.norm1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,266.266 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.norm1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,266.266 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.attn.qkv.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,266.266 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.attn.qkv.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,266.266 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.attn.proj.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,266.266 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.attn.proj.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,266.266 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.norm2.weight: lr = 0.0001; weight_decay = 0.05 
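The grouping rule running through this dump is consistent: every weight gets weight_decay = 0.05 while biases and LayerNorm parameters get 0.0, matching the bias_no_weight_decay and ln_no_weight_decay flags in the config above. A minimal sketch of how such groups are typically built (illustrative only, not the repo's actual get_parameter_groups):

```python
import torch

def build_param_groups(model: torch.nn.Module, lr=1e-4, weight_decay=0.05):
    """Split parameters into decay / no-decay groups, mirroring the dump above:
    biases and LayerNorm parameters train without weight decay."""
    decay, no_decay = [], []
    for name, param in model.named_parameters():
        if not param.requires_grad:
            continue
        if name.endswith(".bias") or "LayerNorm" in name:
            no_decay.append(param)
        else:
            decay.append(param)
    return [
        {"params": decay, "lr": lr, "weight_decay": weight_decay},
        {"params": no_decay, "lr": lr, "weight_decay": 0.0},
    ]

# e.g. optimizer = torch.optim.AdamW(build_param_groups(model), betas=(0.9, 0.999), eps=1e-8)
```

The optimizer dump further below additionally scales several image-encoder groups down to lr = 1e-05, i.e. base_lr times the configured lr_multiplier (0.0001 x 0.1).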
2022-03-16 04:43:46,266.266 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.norm2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,267.267 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.mlp.fc1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,267.267 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.mlp.fc1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,267.267 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.mlp.fc2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,267.267 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.mlp.fc2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,267.267 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.norm1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,267.267 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.norm1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,267.267 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.attn.qkv.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,267.267 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.attn.qkv.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,267.267 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.attn.proj.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,267.267 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.attn.proj.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,268.268 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.norm2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,268.268 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.norm2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,268.268 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.mlp.fc1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,268.268 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.mlp.fc1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,268.268 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.mlp.fc2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,268.268 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.mlp.fc2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,268.268 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.norm1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,268.268 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.norm1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,268.268 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.attn.qkv.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,268.268 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.attn.qkv.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,268.268 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.attn.proj.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,269.269 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.attn.proj.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,269.269 
2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.norm2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,269.269 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.norm2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,269.269 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.mlp.fc1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,269.269 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.mlp.fc1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,269.269 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.mlp.fc2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,269.269 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.mlp.fc2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,269.269 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.norm1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,269.269 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.norm1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,269.269 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.attn.qkv.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,270.270 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.attn.qkv.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,270.270 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.attn.proj.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,270.270 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.attn.proj.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,270.270 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.norm2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,270.270 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.norm2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,270.270 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.mlp.fc1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,270.270 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.mlp.fc1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,270.270 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.mlp.fc2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,270.270 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.mlp.fc2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,270.270 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.norm1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,271.271 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.norm1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,271.271 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.attn.qkv.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,271.271 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.attn.qkv.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,271.271 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.attn.proj.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,271.271 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 
0.attn.proj.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,271.271 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.norm2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,271.271 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.norm2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,271.271 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.mlp.fc1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,271.271 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.mlp.fc1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,271.271 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.mlp.fc2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,272.272 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.mlp.fc2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,272.272 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.norm1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,272.272 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.norm1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,272.272 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.attn.qkv.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,272.272 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.attn.qkv.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,272.272 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.attn.proj.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,272.272 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.attn.proj.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,272.272 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.norm2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,272.272 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.norm2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,273.273 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.mlp.fc1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,273.273 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.mlp.fc1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,273.273 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.mlp.fc2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,273.273 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.mlp.fc2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,273.273 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.norm1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,273.273 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.norm1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,273.273 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.attn.qkv.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,273.273 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.attn.qkv.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,273.273 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.attn.proj.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 
04:43:46,273.273 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.attn.proj.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,273.273 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.norm2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,274.274 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.norm2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,274.274 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.mlp.fc1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,274.274 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.mlp.fc1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,274.274 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.mlp.fc2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,274.274 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.mlp.fc2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,274.274 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.norm1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,274.274 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.norm1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,274.274 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.attn.qkv.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,274.274 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.attn.qkv.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,274.274 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.attn.proj.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,275.275 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.attn.proj.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,275.275 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.norm2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,275.275 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.norm2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,275.275 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.mlp.fc1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,275.275 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.mlp.fc1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,275.275 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.mlp.fc2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,275.275 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.mlp.fc2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,275.275 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): dense.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,275.275 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): dense.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,275.275 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): dense.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,276.276 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): dense.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,276.276 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 
predictions.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,276.276 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): predictions.transform.dense.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,276.276 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): predictions.transform.dense.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,276.276 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): predictions.transform.LayerNorm.weight: lr = 0.0001; weight_decay = 0 2022-03-16 04:43:46,276.276 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): predictions.transform.LayerNorm.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,276.276 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): predictions.decoder.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,276.276 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.0.attention.self.query.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,276.276 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.0.attention.self.query.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,277.277 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.0.attention.self.key.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,277.277 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.0.attention.self.key.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,277.277 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.0.attention.self.value.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,277.277 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.0.attention.self.value.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,277.277 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.0.attention.output.dense.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,277.277 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.0.attention.output.dense.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,277.277 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.0.attention.output.LayerNorm.weight: lr = 0.0001; weight_decay = 0 2022-03-16 04:43:46,277.277 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.0.attention.output.LayerNorm.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,277.277 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.0.intermediate.dense.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,277.277 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.0.intermediate.dense.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,278.278 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.0.output.dense.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,278.278 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.0.output.dense.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,278.278 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.0.output.LayerNorm.weight: lr = 0.0001; weight_decay = 0 2022-03-16 04:43:46,278.278 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 
layer.0.output.LayerNorm.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,278.278 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.1.attention.self.query.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,278.278 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.1.attention.self.query.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,278.278 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.1.attention.self.key.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,278.278 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.1.attention.self.key.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,278.278 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.1.attention.self.value.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,278.278 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.1.attention.self.value.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,278.278 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.1.attention.output.dense.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,279.279 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.1.attention.output.dense.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,279.279 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.1.attention.output.LayerNorm.weight: lr = 0.0001; weight_decay = 0 2022-03-16 04:43:46,279.279 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.1.attention.output.LayerNorm.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,279.279 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.1.intermediate.dense.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,279.279 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.1.intermediate.dense.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,279.279 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.1.output.dense.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,279.279 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.1.output.dense.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,279.279 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.1.output.LayerNorm.weight: lr = 0.0001; weight_decay = 0 2022-03-16 04:43:46,279.279 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.1.output.LayerNorm.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,279.279 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.2.attention.self.query.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,280.280 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.2.attention.self.query.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,280.280 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.2.attention.self.key.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,280.280 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.2.attention.self.key.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,280.280 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 
layer.2.attention.self.value.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,280.280 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.2.attention.self.value.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,280.280 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.2.attention.output.dense.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,280.280 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.2.attention.output.dense.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,280.280 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.2.attention.output.LayerNorm.weight: lr = 0.0001; weight_decay = 0 2022-03-16 04:43:46,280.280 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.2.attention.output.LayerNorm.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,281.281 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.2.intermediate.dense.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,281.281 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.2.intermediate.dense.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,281.281 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.2.output.dense.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,281.281 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.2.output.dense.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,281.281 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.2.output.LayerNorm.weight: lr = 0.0001; weight_decay = 0 2022-03-16 04:43:46,281.281 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.2.output.LayerNorm.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,281.281 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.3.attention.self.query.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,281.281 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.3.attention.self.query.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,281.281 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.3.attention.self.key.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,281.281 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.3.attention.self.key.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,281.281 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.3.attention.self.value.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,282.282 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.3.attention.self.value.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,282.282 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.3.attention.output.dense.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,282.282 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.3.attention.output.dense.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,282.282 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.3.attention.output.LayerNorm.weight: lr = 0.0001; weight_decay = 0 2022-03-16 04:43:46,282.282 2829:tagger_caption_uni_pipeline_expanding.py:655 
get_parameter_groups(): layer.3.attention.output.LayerNorm.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,282.282 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.3.intermediate.dense.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,282.282 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.3.intermediate.dense.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,282.282 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.3.output.dense.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,282.282 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.3.output.dense.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,283.283 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.3.output.LayerNorm.weight: lr = 0.0001; weight_decay = 0 2022-03-16 04:43:46,283.283 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.3.output.LayerNorm.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,283.283 2829:tagger_caption_uni_pipeline_expanding.py:692 get_optimizer(): LR Updating... learning rate 0.0001, 2022-03-16 04:43:46,292.292 2829:tagger_caption_uni_pipeline_expanding.py:608 train(): AdamW ( Parameter Group 0 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 lr: 0.0001 param_names: ['module.cls_token', 'module.pos_embed', 'module.patch_embed.proj.weight', 'module.head.weight'] weight_decay: 0.05 Parameter Group 1 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 lr: 0.0001 param_names: ['module.patch_embed.proj.bias', 'module.head.bias'] weight_decay: 0.0 Parameter Group 2 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 lr: 0.0001 param_names: ['word_embeddings.weight', 'position_embeddings.weight', 'token_type_embeddings.weight'] weight_decay: 0.05 Parameter Group 3 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 lr: 0.0001 param_names: ['LayerNorm.weight', 'LayerNorm.bias'] weight_decay: 0 Parameter Group 4 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 lr: 1e-05 param_names: ['0.norm1.weight', '0.attn.qkv.weight', '0.attn.proj.weight', '0.norm2.weight', '0.mlp.fc1.weight', '0.mlp.fc2.weight', '1.norm1.weight', '1.attn.qkv.weight', '1.attn.proj.weight', '1.norm2.weight', '1.mlp.fc1.weight', '1.mlp.fc2.weight', '2.norm1.weight', '2.attn.qkv.weight', '2.attn.proj.weight', '2.norm2.weight', '2.mlp.fc1.weight', '2.mlp.fc2.weight', '3.norm1.weight', '3.attn.qkv.weight', '3.attn.proj.weight', '3.norm2.weight', '3.mlp.fc1.weight', '3.mlp.fc2.weight', '4.norm1.weight', '4.attn.qkv.weight', '4.attn.proj.weight', '4.norm2.weight', '4.mlp.fc1.weight', '4.mlp.fc2.weight', '5.norm1.weight', '5.attn.qkv.weight', '5.attn.proj.weight', '5.norm2.weight', '5.mlp.fc1.weight', '5.mlp.fc2.weight', '6.norm1.weight', '6.attn.qkv.weight', '6.attn.proj.weight', '6.norm2.weight', '6.mlp.fc1.weight', '6.mlp.fc2.weight', '7.norm1.weight', '7.attn.qkv.weight', '7.attn.proj.weight', '7.norm2.weight', '7.mlp.fc1.weight', '7.mlp.fc2.weight'] weight_decay: 0.05 Parameter Group 5 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 lr: 1e-05 param_names: ['0.norm1.bias', '0.attn.qkv.bias', '0.attn.proj.bias', '0.norm2.bias', '0.mlp.fc1.bias', '0.mlp.fc2.bias', '1.norm1.bias', '1.attn.qkv.bias', '1.attn.proj.bias', '1.norm2.bias', '1.mlp.fc1.bias', '1.mlp.fc2.bias', '2.norm1.bias', '2.attn.qkv.bias', '2.attn.proj.bias', '2.norm2.bias', '2.mlp.fc1.bias', '2.mlp.fc2.bias', '3.norm1.bias', '3.attn.qkv.bias', 
'3.attn.proj.bias', '3.norm2.bias', '3.mlp.fc1.bias', '3.mlp.fc2.bias', '4.norm1.bias', '4.attn.qkv.bias', '4.attn.proj.bias', '4.norm2.bias', '4.mlp.fc1.bias', '4.mlp.fc2.bias', '5.norm1.bias', '5.attn.qkv.bias', '5.attn.proj.bias', '5.norm2.bias', '5.mlp.fc1.bias', '5.mlp.fc2.bias', '6.norm1.bias', '6.attn.qkv.bias', '6.attn.proj.bias', '6.norm2.bias', '6.mlp.fc1.bias', '6.mlp.fc2.bias', '7.norm1.bias', '7.attn.qkv.bias', '7.attn.proj.bias', '7.norm2.bias', '7.mlp.fc1.bias', '7.mlp.fc2.bias'] weight_decay: 0.0 Parameter Group 6 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 lr: 0.0001 param_names: ['0.norm1.weight', '0.attn.qkv.weight', '0.attn.proj.weight', '0.norm2.weight', '0.mlp.fc1.weight', '0.mlp.fc2.weight', '1.norm1.weight', '1.attn.qkv.weight', '1.attn.proj.weight', '1.norm2.weight', '1.mlp.fc1.weight', '1.mlp.fc2.weight', '2.norm1.weight', '2.attn.qkv.weight', '2.attn.proj.weight', '2.norm2.weight', '2.mlp.fc1.weight', '2.mlp.fc2.weight', '3.norm1.weight', '3.attn.qkv.weight', '3.attn.proj.weight', '3.norm2.weight', '3.mlp.fc1.weight', '3.mlp.fc2.weight'] weight_decay: 0.05 Parameter Group 7 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 lr: 0.0001 param_names: ['0.norm1.bias', '0.attn.qkv.bias', '0.attn.proj.bias', '0.norm2.bias', '0.mlp.fc1.bias', '0.mlp.fc2.bias', '1.norm1.bias', '1.attn.qkv.bias', '1.attn.proj.bias', '1.norm2.bias', '1.mlp.fc1.bias', '1.mlp.fc2.bias', '2.norm1.bias', '2.attn.qkv.bias', '2.attn.proj.bias', '2.norm2.bias', '2.mlp.fc1.bias', '2.mlp.fc2.bias', '3.norm1.bias', '3.attn.qkv.bias', '3.attn.proj.bias', '3.norm2.bias', '3.mlp.fc1.bias', '3.mlp.fc2.bias'] weight_decay: 0.0 Parameter Group 8 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 lr: 1e-05 param_names: ['0.norm1.weight', '0.attn.qkv.weight', '0.attn.proj.weight', '0.norm2.weight', '0.mlp.fc1.weight', '0.mlp.fc2.weight', '1.norm1.weight', '1.attn.qkv.weight', '1.attn.proj.weight', '1.norm2.weight', '1.mlp.fc1.weight', '1.mlp.fc2.weight', '2.norm1.weight', '2.attn.qkv.weight', '2.attn.proj.weight', '2.norm2.weight', '2.mlp.fc1.weight', '2.mlp.fc2.weight', '3.norm1.weight', '3.attn.qkv.weight', '3.attn.proj.weight', '3.norm2.weight', '3.mlp.fc1.weight', '3.mlp.fc2.weight'] weight_decay: 0.05 Parameter Group 9 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 lr: 1e-05 param_names: ['0.norm1.bias', '0.attn.qkv.bias', '0.attn.proj.bias', '0.norm2.bias', '0.mlp.fc1.bias', '0.mlp.fc2.bias', '1.norm1.bias', '1.attn.qkv.bias', '1.attn.proj.bias', '1.norm2.bias', '1.mlp.fc1.bias', '1.mlp.fc2.bias', '2.norm1.bias', '2.attn.qkv.bias', '2.attn.proj.bias', '2.norm2.bias', '2.mlp.fc1.bias', '2.mlp.fc2.bias', '3.norm1.bias', '3.attn.qkv.bias', '3.attn.proj.bias', '3.norm2.bias', '3.mlp.fc1.bias', '3.mlp.fc2.bias'] weight_decay: 0.0 Parameter Group 10 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 lr: 0.0001 param_names: ['dense.weight'] weight_decay: 0.05 Parameter Group 11 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 lr: 0.0001 param_names: ['dense.bias'] weight_decay: 0.0 Parameter Group 12 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 lr: 1e-05 param_names: ['dense.weight'] weight_decay: 0.05 Parameter Group 13 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 lr: 1e-05 param_names: ['dense.bias'] weight_decay: 0.0 Parameter Group 14 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 lr: 1e-05 param_names: ['predictions.bias', 'predictions.transform.dense.bias', 'predictions.transform.LayerNorm.weight', 'predictions.transform.LayerNorm.bias'] weight_decay: 0.0 Parameter Group 15 betas: 
(0.9, 0.999) correct_bias: True eps: 1e-08 lr: 1e-05 param_names: ['predictions.transform.dense.weight', 'predictions.decoder.weight'] weight_decay: 0.05 Parameter Group 16 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 lr: 0.0001 param_names: ['layer.0.attention.self.query.weight', 'layer.0.attention.self.key.weight', 'layer.0.attention.self.value.weight', 'layer.0.attention.output.dense.weight', 'layer.0.intermediate.dense.weight', 'layer.0.output.dense.weight', 'layer.1.attention.self.query.weight', 'layer.1.attention.self.key.weight', 'layer.1.attention.self.value.weight', 'layer.1.attention.output.dense.weight', 'layer.1.intermediate.dense.weight', 'layer.1.output.dense.weight', 'layer.2.attention.self.query.weight', 'layer.2.attention.self.key.weight', 'layer.2.attention.self.value.weight', 'layer.2.attention.output.dense.weight', 'layer.2.intermediate.dense.weight', 'layer.2.output.dense.weight', 'layer.3.attention.self.query.weight', 'layer.3.attention.self.key.weight', 'layer.3.attention.self.value.weight', 'layer.3.attention.output.dense.weight', 'layer.3.intermediate.dense.weight', 'layer.3.output.dense.weight'] weight_decay: 0.05 Parameter Group 17 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 lr: 0.0001 param_names: ['layer.0.attention.self.query.bias', 'layer.0.attention.self.key.bias', 'layer.0.attention.self.value.bias', 'layer.0.attention.output.dense.bias', 'layer.0.attention.output.LayerNorm.weight', 'layer.0.attention.output.LayerNorm.bias', 'layer.0.intermediate.dense.bias', 'layer.0.output.dense.bias', 'layer.0.output.LayerNorm.weight', 'layer.0.output.LayerNorm.bias', 'layer.1.attention.self.query.bias', 'layer.1.attention.self.key.bias', 'layer.1.attention.self.value.bias', 'layer.1.attention.output.dense.bias', 'layer.1.attention.output.LayerNorm.weight', 'layer.1.attention.output.LayerNorm.bias', 'layer.1.intermediate.dense.bias', 'layer.1.output.dense.bias', 'layer.1.output.LayerNorm.weight', 'layer.1.output.LayerNorm.bias', 'layer.2.attention.self.query.bias', 'layer.2.attention.self.key.bias', 'layer.2.attention.self.value.bias', 'layer.2.attention.output.dense.bias', 'layer.2.attention.output.LayerNorm.weight', 'layer.2.attention.output.LayerNorm.bias', 'layer.2.intermediate.dense.bias', 'layer.2.output.dense.bias', 'layer.2.output.LayerNorm.weight', 'layer.2.output.LayerNorm.bias', 'layer.3.attention.self.query.bias', 'layer.3.attention.self.key.bias', 'layer.3.attention.self.value.bias', 'layer.3.attention.output.dense.bias', 'layer.3.attention.output.LayerNorm.weight', 'layer.3.attention.output.LayerNorm.bias', 'layer.3.intermediate.dense.bias', 'layer.3.output.dense.bias', 'layer.3.output.LayerNorm.weight', 'layer.3.output.LayerNorm.bias'] weight_decay: 0.0 ) 2022-03-16 04:43:46,459.459 2829:checkpoint.py:240 load(): Loading checkpoint from ./output/Jacob_Tagger_TaxCCSBUCocoVGCap_B_Vilt_ViT_16_384_20_epoch_lr_5e-5_BS_1024_loss_focal_crop_0.08_bert_category/snapshot/model_iter_0081989.pt 2022-03-16 04:43:58,590.590 2829:checkpoint.py:99 align_and_update_state_dicts(): module.image_encoder.module.cls_token loaded from image_encoder.module.cls_token of shape (1, 1, 768) 2022-03-16 04:43:58,590.590 2829:checkpoint.py:99 align_and_update_state_dicts(): module.image_encoder.module.head.bias loaded from image_encoder.module.head.bias of shape (1000,) 2022-03-16 04:43:58,590.590 2829:checkpoint.py:99 align_and_update_state_dicts(): module.image_encoder.module.head.weight loaded from image_encoder.module.head.weight of shape (1000, 768) 2022-03-16 
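The AdamW dump above splits the parameters along two axes: weight tensors get weight_decay = 0.05 while biases and LayerNorm parameters get 0.0, and the groups listed with lr: 1e-05 (already-pretrained blocks and heads) run at a reduced rate while the rest use the base lr = 1e-4. Below is a minimal sketch of how such groups can be built; the helper name and the `pretrained_prefixes` argument are illustrative, not the pipeline's actual get_parameter_groups().

```python
import torch

def build_param_groups(model, base_lr=1e-4, pretrained_lr=1e-5,
                       weight_decay=0.05, pretrained_prefixes=()):
    """Split parameters along two axes, mirroring the dump above:
    decay vs. no-decay (biases and 1-D LayerNorm params get 0.0),
    and base lr vs. a reduced lr for already-pretrained submodules."""
    groups = {}
    for name, p in model.named_parameters():
        if not p.requires_grad:
            continue
        # 1-D tensors (biases, LayerNorm weights) are excluded from decay.
        decay = 0.0 if p.ndim <= 1 or name.endswith(".bias") else weight_decay
        lr = pretrained_lr if name.startswith(tuple(pretrained_prefixes)) else base_lr
        groups.setdefault((lr, decay), []).append(p)
    return [{"params": ps, "lr": lr, "weight_decay": wd}
            for (lr, wd), ps in groups.items()]

# optimizer = torch.optim.AdamW(build_param_groups(model),
#                               betas=(0.9, 0.999), eps=1e-8)
```

Grouping by the (lr, weight_decay) pair rather than per parameter keeps the optimizer state compact while reproducing the per-name assignments logged by get_parameter_groups().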
04:43:58,590.590 2829:checkpoint.py:99 align_and_update_state_dicts(): module.image_encoder.module.patch_embed.proj.bias loaded from image_encoder.module.patch_embed.proj.bias of shape (768,) 2022-03-16 04:43:58,590.590 2829:checkpoint.py:99 align_and_update_state_dicts(): module.image_encoder.module.patch_embed.proj.weight loaded from image_encoder.module.patch_embed.proj.weight of shape (768, 3, 16, 16) 2022-03-16 04:43:58,590.590 2829:checkpoint.py:99 align_and_update_state_dicts(): module.image_encoder.module.pos_embed loaded from image_encoder.module.pos_embed of shape (1, 577, 768) 2022-03-16 04:43:58,590.590 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.0.attn.proj.bias loaded from bert.encoder.blocks.0.attn.proj.bias of shape (768,) 2022-03-16 04:43:58,591.591 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.0.attn.proj.weight loaded from bert.encoder.blocks.0.attn.proj.weight of shape (768, 768) 2022-03-16 04:43:58,591.591 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.0.attn.qkv.bias loaded from bert.encoder.blocks.0.attn.qkv.bias of shape (2304,) 2022-03-16 04:43:58,591.591 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.0.attn.qkv.weight loaded from bert.encoder.blocks.0.attn.qkv.weight of shape (2304, 768) 2022-03-16 04:43:58,591.591 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.0.mlp.fc1.bias loaded from bert.encoder.blocks.0.mlp.fc1.bias of shape (3072,) 2022-03-16 04:43:58,591.591 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.0.mlp.fc1.weight loaded from bert.encoder.blocks.0.mlp.fc1.weight of shape (3072, 768) 2022-03-16 04:43:58,591.591 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.0.mlp.fc2.bias loaded from bert.encoder.blocks.0.mlp.fc2.bias of shape (768,) 2022-03-16 04:43:58,591.591 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.0.mlp.fc2.weight loaded from bert.encoder.blocks.0.mlp.fc2.weight of shape (768, 3072) 2022-03-16 04:43:58,591.591 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.0.norm1.bias loaded from bert.encoder.blocks.0.norm1.bias of shape (768,) 2022-03-16 04:43:58,591.591 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.0.norm1.weight loaded from bert.encoder.blocks.0.norm1.weight of shape (768,) 2022-03-16 04:43:58,591.591 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.0.norm2.bias loaded from bert.encoder.blocks.0.norm2.bias of shape (768,) 2022-03-16 04:43:58,591.591 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.0.norm2.weight loaded from bert.encoder.blocks.0.norm2.weight of shape (768,) 2022-03-16 04:43:58,591.591 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.1.attn.proj.bias loaded from bert.encoder.blocks.1.attn.proj.bias of shape (768,) 2022-03-16 04:43:58,591.591 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.1.attn.proj.weight loaded from bert.encoder.blocks.1.attn.proj.weight of shape (768, 768) 2022-03-16 04:43:58,591.591 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.1.attn.qkv.bias loaded from bert.encoder.blocks.1.attn.qkv.bias of 
shape (2304,) 2022-03-16 04:43:58,591.591 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.1.attn.qkv.weight loaded from bert.encoder.blocks.1.attn.qkv.weight of shape (2304, 768) 2022-03-16 04:43:58,592.592 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.1.mlp.fc1.bias loaded from bert.encoder.blocks.1.mlp.fc1.bias of shape (3072,) 2022-03-16 04:43:58,592.592 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.1.mlp.fc1.weight loaded from bert.encoder.blocks.1.mlp.fc1.weight of shape (3072, 768) 2022-03-16 04:43:58,592.592 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.1.mlp.fc2.bias loaded from bert.encoder.blocks.1.mlp.fc2.bias of shape (768,) 2022-03-16 04:43:58,592.592 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.1.mlp.fc2.weight loaded from bert.encoder.blocks.1.mlp.fc2.weight of shape (768, 3072) 2022-03-16 04:43:58,592.592 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.1.norm1.bias loaded from bert.encoder.blocks.1.norm1.bias of shape (768,) 2022-03-16 04:43:58,592.592 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.1.norm1.weight loaded from bert.encoder.blocks.1.norm1.weight of shape (768,) 2022-03-16 04:43:58,592.592 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.1.norm2.bias loaded from bert.encoder.blocks.1.norm2.bias of shape (768,) 2022-03-16 04:43:58,592.592 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.1.norm2.weight loaded from bert.encoder.blocks.1.norm2.weight of shape (768,) 2022-03-16 04:43:58,592.592 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.10.attn.proj.bias loaded from bert.encoder.blocks.10.attn.proj.bias of shape (768,) 2022-03-16 04:43:58,592.592 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.10.attn.proj.weight loaded from bert.encoder.blocks.10.attn.proj.weight of shape (768, 768) 2022-03-16 04:43:58,592.592 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.10.attn.qkv.bias loaded from bert.encoder.blocks.10.attn.qkv.bias of shape (2304,) 2022-03-16 04:43:58,592.592 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.10.attn.qkv.weight loaded from bert.encoder.blocks.10.attn.qkv.weight of shape (2304, 768) 2022-03-16 04:43:58,592.592 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.10.mlp.fc1.bias loaded from bert.encoder.blocks.10.mlp.fc1.bias of shape (3072,) 2022-03-16 04:43:58,593.593 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.10.mlp.fc1.weight loaded from bert.encoder.blocks.10.mlp.fc1.weight of shape (3072, 768) 2022-03-16 04:43:58,593.593 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.10.mlp.fc2.bias loaded from bert.encoder.blocks.10.mlp.fc2.bias of shape (768,) 2022-03-16 04:43:58,593.593 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.10.mlp.fc2.weight loaded from bert.encoder.blocks.10.mlp.fc2.weight of shape (768, 3072) 2022-03-16 04:43:58,593.593 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.10.norm1.bias loaded from 
bert.encoder.blocks.10.norm1.bias of shape (768,) 2022-03-16 04:43:58,593.593 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.10.norm1.weight loaded from bert.encoder.blocks.10.norm1.weight of shape (768,) 2022-03-16 04:43:58,593.593 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.10.norm2.bias loaded from bert.encoder.blocks.10.norm2.bias of shape (768,) 2022-03-16 04:43:58,593.593 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.10.norm2.weight loaded from bert.encoder.blocks.10.norm2.weight of shape (768,) 2022-03-16 04:43:58,593.593 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.11.attn.proj.bias loaded from bert.encoder.blocks.11.attn.proj.bias of shape (768,) 2022-03-16 04:43:58,593.593 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.11.attn.proj.weight loaded from bert.encoder.blocks.11.attn.proj.weight of shape (768, 768) 2022-03-16 04:43:58,593.593 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.11.attn.qkv.bias loaded from bert.encoder.blocks.11.attn.qkv.bias of shape (2304,) 2022-03-16 04:43:58,593.593 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.11.attn.qkv.weight loaded from bert.encoder.blocks.11.attn.qkv.weight of shape (2304, 768) 2022-03-16 04:43:58,593.593 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.11.mlp.fc1.bias loaded from bert.encoder.blocks.11.mlp.fc1.bias of shape (3072,) 2022-03-16 04:43:58,594.594 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.11.mlp.fc1.weight loaded from bert.encoder.blocks.11.mlp.fc1.weight of shape (3072, 768) 2022-03-16 04:43:58,594.594 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.11.mlp.fc2.bias loaded from bert.encoder.blocks.11.mlp.fc2.bias of shape (768,) 2022-03-16 04:43:58,594.594 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.11.mlp.fc2.weight loaded from bert.encoder.blocks.11.mlp.fc2.weight of shape (768, 3072) 2022-03-16 04:43:58,594.594 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.11.norm1.bias loaded from bert.encoder.blocks.11.norm1.bias of shape (768,) 2022-03-16 04:43:58,594.594 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.11.norm1.weight loaded from bert.encoder.blocks.11.norm1.weight of shape (768,) 2022-03-16 04:43:58,594.594 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.11.norm2.bias loaded from bert.encoder.blocks.11.norm2.bias of shape (768,) 2022-03-16 04:43:58,594.594 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.11.norm2.weight loaded from bert.encoder.blocks.11.norm2.weight of shape (768,) 2022-03-16 04:43:58,594.594 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.2.attn.proj.bias loaded from bert.encoder.blocks.2.attn.proj.bias of shape (768,) 2022-03-16 04:43:58,594.594 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.2.attn.proj.weight loaded from bert.encoder.blocks.2.attn.proj.weight of shape (768, 768) 2022-03-16 04:43:58,594.594 2829:checkpoint.py:99 align_and_update_state_dicts(): 
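A reading aid for the shapes in this listing: PyTorch `nn.Linear` stores its weight as (out_features, in_features), so the fused qkv projection of a 768-dim block is (2304, 768) with query, key and value stacked along the output dimension, the MLP pair is (3072, 768) and (768, 3072), and every bias or LayerNorm parameter is 1-D. A quick check:

```python
import torch.nn as nn

# nn.Linear weight is (out_features, in_features); qkv fuses 3 x 768 outputs.
qkv = nn.Linear(768, 3 * 768, bias=True)
print(tuple(qkv.weight.shape), tuple(qkv.bias.shape))  # (2304, 768) (2304,)
```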
module.module.bert.encoder.blocks.2.attn.qkv.bias loaded from bert.encoder.blocks.2.attn.qkv.bias of shape (2304,) 2022-03-16 04:43:58,594.594 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.2.attn.qkv.weight loaded from bert.encoder.blocks.2.attn.qkv.weight of shape (2304, 768) 2022-03-16 04:43:58,594.594 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.2.mlp.fc1.bias loaded from bert.encoder.blocks.2.mlp.fc1.bias of shape (3072,) 2022-03-16 04:43:58,594.594 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.2.mlp.fc1.weight loaded from bert.encoder.blocks.2.mlp.fc1.weight of shape (3072, 768) 2022-03-16 04:43:58,595.595 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.2.mlp.fc2.bias loaded from bert.encoder.blocks.2.mlp.fc2.bias of shape (768,) 2022-03-16 04:43:58,595.595 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.2.mlp.fc2.weight loaded from bert.encoder.blocks.2.mlp.fc2.weight of shape (768, 3072) 2022-03-16 04:43:58,595.595 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.2.norm1.bias loaded from bert.encoder.blocks.2.norm1.bias of shape (768,) 2022-03-16 04:43:58,595.595 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.2.norm1.weight loaded from bert.encoder.blocks.2.norm1.weight of shape (768,) 2022-03-16 04:43:58,595.595 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.2.norm2.bias loaded from bert.encoder.blocks.2.norm2.bias of shape (768,) 2022-03-16 04:43:58,595.595 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.2.norm2.weight loaded from bert.encoder.blocks.2.norm2.weight of shape (768,) 2022-03-16 04:43:58,595.595 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.3.attn.proj.bias loaded from bert.encoder.blocks.3.attn.proj.bias of shape (768,) 2022-03-16 04:43:58,595.595 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.3.attn.proj.weight loaded from bert.encoder.blocks.3.attn.proj.weight of shape (768, 768) 2022-03-16 04:43:58,595.595 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.3.attn.qkv.bias loaded from bert.encoder.blocks.3.attn.qkv.bias of shape (2304,) 2022-03-16 04:43:58,595.595 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.3.attn.qkv.weight loaded from bert.encoder.blocks.3.attn.qkv.weight of shape (2304, 768) 2022-03-16 04:43:58,595.595 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.3.mlp.fc1.bias loaded from bert.encoder.blocks.3.mlp.fc1.bias of shape (3072,) 2022-03-16 04:43:58,595.595 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.3.mlp.fc1.weight loaded from bert.encoder.blocks.3.mlp.fc1.weight of shape (3072, 768) 2022-03-16 04:43:58,596.596 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.3.mlp.fc2.bias loaded from bert.encoder.blocks.3.mlp.fc2.bias of shape (768,) 2022-03-16 04:43:58,596.596 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.3.mlp.fc2.weight loaded from bert.encoder.blocks.3.mlp.fc2.weight of shape (768, 3072) 2022-03-16 04:43:58,596.596 2829:checkpoint.py:99 
align_and_update_state_dicts(): module.module.bert.encoder.blocks.3.norm1.bias loaded from bert.encoder.blocks.3.norm1.bias of shape (768,) 2022-03-16 04:43:58,596.596 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.3.norm1.weight loaded from bert.encoder.blocks.3.norm1.weight of shape (768,) 2022-03-16 04:43:58,596.596 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.3.norm2.bias loaded from bert.encoder.blocks.3.norm2.bias of shape (768,) 2022-03-16 04:43:58,596.596 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.3.norm2.weight loaded from bert.encoder.blocks.3.norm2.weight of shape (768,) 2022-03-16 04:43:58,596.596 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.4.attn.proj.bias loaded from bert.encoder.blocks.4.attn.proj.bias of shape (768,) 2022-03-16 04:43:58,596.596 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.4.attn.proj.weight loaded from bert.encoder.blocks.4.attn.proj.weight of shape (768, 768) 2022-03-16 04:43:58,596.596 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.4.attn.qkv.bias loaded from bert.encoder.blocks.4.attn.qkv.bias of shape (2304,) 2022-03-16 04:43:58,596.596 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.4.attn.qkv.weight loaded from bert.encoder.blocks.4.attn.qkv.weight of shape (2304, 768) 2022-03-16 04:43:58,596.596 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.4.mlp.fc1.bias loaded from bert.encoder.blocks.4.mlp.fc1.bias of shape (3072,) 2022-03-16 04:43:58,597.597 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.4.mlp.fc1.weight loaded from bert.encoder.blocks.4.mlp.fc1.weight of shape (3072, 768) 2022-03-16 04:43:58,597.597 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.4.mlp.fc2.bias loaded from bert.encoder.blocks.4.mlp.fc2.bias of shape (768,) 2022-03-16 04:43:58,597.597 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.4.mlp.fc2.weight loaded from bert.encoder.blocks.4.mlp.fc2.weight of shape (768, 3072) 2022-03-16 04:43:58,597.597 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.4.norm1.bias loaded from bert.encoder.blocks.4.norm1.bias of shape (768,) 2022-03-16 04:43:58,597.597 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.4.norm1.weight loaded from bert.encoder.blocks.4.norm1.weight of shape (768,) 2022-03-16 04:43:58,597.597 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.4.norm2.bias loaded from bert.encoder.blocks.4.norm2.bias of shape (768,) 2022-03-16 04:43:58,597.597 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.4.norm2.weight loaded from bert.encoder.blocks.4.norm2.weight of shape (768,) 2022-03-16 04:43:58,597.597 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.5.attn.proj.bias loaded from bert.encoder.blocks.5.attn.proj.bias of shape (768,) 2022-03-16 04:43:58,597.597 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.5.attn.proj.weight loaded from bert.encoder.blocks.5.attn.proj.weight of shape (768, 768) 2022-03-16 04:43:58,597.597 2829:checkpoint.py:99 
align_and_update_state_dicts(): module.module.bert.encoder.blocks.5.attn.qkv.bias loaded from bert.encoder.blocks.5.attn.qkv.bias of shape (2304,) 2022-03-16 04:43:58,598.598 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.5.attn.qkv.weight loaded from bert.encoder.blocks.5.attn.qkv.weight of shape (2304, 768) 2022-03-16 04:43:58,598.598 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.5.mlp.fc1.bias loaded from bert.encoder.blocks.5.mlp.fc1.bias of shape (3072,) 2022-03-16 04:43:58,598.598 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.5.mlp.fc1.weight loaded from bert.encoder.blocks.5.mlp.fc1.weight of shape (3072, 768) 2022-03-16 04:43:58,598.598 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.5.mlp.fc2.bias loaded from bert.encoder.blocks.5.mlp.fc2.bias of shape (768,) 2022-03-16 04:43:58,598.598 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.5.mlp.fc2.weight loaded from bert.encoder.blocks.5.mlp.fc2.weight of shape (768, 3072) 2022-03-16 04:43:58,598.598 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.5.norm1.bias loaded from bert.encoder.blocks.5.norm1.bias of shape (768,) 2022-03-16 04:43:58,598.598 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.5.norm1.weight loaded from bert.encoder.blocks.5.norm1.weight of shape (768,) 2022-03-16 04:43:58,598.598 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.5.norm2.bias loaded from bert.encoder.blocks.5.norm2.bias of shape (768,) 2022-03-16 04:43:58,598.598 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.5.norm2.weight loaded from bert.encoder.blocks.5.norm2.weight of shape (768,) 2022-03-16 04:43:58,598.598 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.6.attn.proj.bias loaded from bert.encoder.blocks.6.attn.proj.bias of shape (768,) 2022-03-16 04:43:58,598.598 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.6.attn.proj.weight loaded from bert.encoder.blocks.6.attn.proj.weight of shape (768, 768) 2022-03-16 04:43:58,598.598 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.6.attn.qkv.bias loaded from bert.encoder.blocks.6.attn.qkv.bias of shape (2304,) 2022-03-16 04:43:58,598.598 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.6.attn.qkv.weight loaded from bert.encoder.blocks.6.attn.qkv.weight of shape (2304, 768) 2022-03-16 04:43:58,599.599 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.6.mlp.fc1.bias loaded from bert.encoder.blocks.6.mlp.fc1.bias of shape (3072,) 2022-03-16 04:43:58,599.599 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.6.mlp.fc1.weight loaded from bert.encoder.blocks.6.mlp.fc1.weight of shape (3072, 768) 2022-03-16 04:43:58,599.599 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.6.mlp.fc2.bias loaded from bert.encoder.blocks.6.mlp.fc2.bias of shape (768,) 2022-03-16 04:43:58,599.599 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.6.mlp.fc2.weight loaded from bert.encoder.blocks.6.mlp.fc2.weight of shape (768, 3072) 2022-03-16 04:43:58,599.599 
2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.6.norm1.bias loaded from bert.encoder.blocks.6.norm1.bias of shape (768,) 2022-03-16 04:43:58,599.599 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.6.norm1.weight loaded from bert.encoder.blocks.6.norm1.weight of shape (768,) 2022-03-16 04:43:58,599.599 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.6.norm2.bias loaded from bert.encoder.blocks.6.norm2.bias of shape (768,) 2022-03-16 04:43:58,599.599 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.6.norm2.weight loaded from bert.encoder.blocks.6.norm2.weight of shape (768,) 2022-03-16 04:43:58,599.599 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.7.attn.proj.bias loaded from bert.encoder.blocks.7.attn.proj.bias of shape (768,) 2022-03-16 04:43:58,599.599 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.7.attn.proj.weight loaded from bert.encoder.blocks.7.attn.proj.weight of shape (768, 768) 2022-03-16 04:43:58,599.599 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.7.attn.qkv.bias loaded from bert.encoder.blocks.7.attn.qkv.bias of shape (2304,) 2022-03-16 04:43:58,599.599 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.7.attn.qkv.weight loaded from bert.encoder.blocks.7.attn.qkv.weight of shape (2304, 768) 2022-03-16 04:43:58,600.600 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.7.mlp.fc1.bias loaded from bert.encoder.blocks.7.mlp.fc1.bias of shape (3072,) 2022-03-16 04:43:58,600.600 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.7.mlp.fc1.weight loaded from bert.encoder.blocks.7.mlp.fc1.weight of shape (3072, 768) 2022-03-16 04:43:58,600.600 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.7.mlp.fc2.bias loaded from bert.encoder.blocks.7.mlp.fc2.bias of shape (768,) 2022-03-16 04:43:58,600.600 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.7.mlp.fc2.weight loaded from bert.encoder.blocks.7.mlp.fc2.weight of shape (768, 3072) 2022-03-16 04:43:58,600.600 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.7.norm1.bias loaded from bert.encoder.blocks.7.norm1.bias of shape (768,) 2022-03-16 04:43:58,600.600 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.7.norm1.weight loaded from bert.encoder.blocks.7.norm1.weight of shape (768,) 2022-03-16 04:43:58,600.600 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.7.norm2.bias loaded from bert.encoder.blocks.7.norm2.bias of shape (768,) 2022-03-16 04:43:58,600.600 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.7.norm2.weight loaded from bert.encoder.blocks.7.norm2.weight of shape (768,) 2022-03-16 04:43:58,600.600 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.8.attn.proj.bias loaded from bert.encoder.blocks.8.attn.proj.bias of shape (768,) 2022-03-16 04:43:58,600.600 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.8.attn.proj.weight loaded from bert.encoder.blocks.8.attn.proj.weight of shape (768, 768) 2022-03-16 04:43:58,600.600 
2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.8.attn.qkv.bias loaded from bert.encoder.blocks.8.attn.qkv.bias of shape (2304,) 2022-03-16 04:43:58,600.600 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.8.attn.qkv.weight loaded from bert.encoder.blocks.8.attn.qkv.weight of shape (2304, 768) 2022-03-16 04:43:58,600.600 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.8.mlp.fc1.bias loaded from bert.encoder.blocks.8.mlp.fc1.bias of shape (3072,) 2022-03-16 04:43:58,601.601 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.8.mlp.fc1.weight loaded from bert.encoder.blocks.8.mlp.fc1.weight of shape (3072, 768) 2022-03-16 04:43:58,601.601 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.8.mlp.fc2.bias loaded from bert.encoder.blocks.8.mlp.fc2.bias of shape (768,) 2022-03-16 04:43:58,601.601 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.8.mlp.fc2.weight loaded from bert.encoder.blocks.8.mlp.fc2.weight of shape (768, 3072) 2022-03-16 04:43:58,601.601 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.8.norm1.bias loaded from bert.encoder.blocks.8.norm1.bias of shape (768,) 2022-03-16 04:43:58,601.601 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.8.norm1.weight loaded from bert.encoder.blocks.8.norm1.weight of shape (768,) 2022-03-16 04:43:58,601.601 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.8.norm2.bias loaded from bert.encoder.blocks.8.norm2.bias of shape (768,) 2022-03-16 04:43:58,601.601 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.8.norm2.weight loaded from bert.encoder.blocks.8.norm2.weight of shape (768,) 2022-03-16 04:43:58,601.601 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.9.attn.proj.bias loaded from bert.encoder.blocks.9.attn.proj.bias of shape (768,) 2022-03-16 04:43:58,601.601 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.9.attn.proj.weight loaded from bert.encoder.blocks.9.attn.proj.weight of shape (768, 768) 2022-03-16 04:43:58,601.601 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.9.attn.qkv.bias loaded from bert.encoder.blocks.9.attn.qkv.bias of shape (2304,) 2022-03-16 04:43:58,601.601 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.9.attn.qkv.weight loaded from bert.encoder.blocks.9.attn.qkv.weight of shape (2304, 768) 2022-03-16 04:43:58,602.602 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.9.mlp.fc1.bias loaded from bert.encoder.blocks.9.mlp.fc1.bias of shape (3072,) 2022-03-16 04:43:58,602.602 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.9.mlp.fc1.weight loaded from bert.encoder.blocks.9.mlp.fc1.weight of shape (3072, 768) 2022-03-16 04:43:58,602.602 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.9.mlp.fc2.bias loaded from bert.encoder.blocks.9.mlp.fc2.bias of shape (768,) 2022-03-16 04:43:58,602.602 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.9.mlp.fc2.weight loaded from bert.encoder.blocks.9.mlp.fc2.weight of shape (768, 3072) 2022-03-16 
04:43:58,602.602 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.9.norm1.bias loaded from bert.encoder.blocks.9.norm1.bias of shape (768,) 2022-03-16 04:43:58,602.602 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.9.norm1.weight loaded from bert.encoder.blocks.9.norm1.weight of shape (768,) 2022-03-16 04:43:58,602.602 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.9.norm2.bias loaded from bert.encoder.blocks.9.norm2.bias of shape (768,) 2022-03-16 04:43:58,602.602 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.9.norm2.weight loaded from bert.encoder.blocks.9.norm2.weight of shape (768,) 2022-03-16 04:43:58,602.602 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.pooler.dense.bias loaded from bert.pooler.dense.bias of shape (768,) 2022-03-16 04:43:58,603.603 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.pooler.dense.weight loaded from bert.pooler.dense.weight of shape (768, 768) 2022-03-16 04:43:58,603.603 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.tag_logit.predictions.bias loaded from bert.tag_logit.predictions.bias of shape (30522,) 2022-03-16 04:43:58,603.603 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.tag_logit.predictions.decoder.weight loaded from bert.tag_logit.predictions.decoder.weight of shape (30522, 768) 2022-03-16 04:43:58,603.603 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.tag_logit.predictions.transform.LayerNorm.bias loaded from bert.tag_logit.predictions.transform.LayerNorm.bias of shape (768,) 2022-03-16 04:43:58,603.603 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.tag_logit.predictions.transform.LayerNorm.weight loaded from bert.tag_logit.predictions.transform.LayerNorm.weight of shape (768,) 2022-03-16 04:43:58,603.603 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.tag_logit.predictions.transform.dense.bias loaded from bert.tag_logit.predictions.transform.dense.bias of shape (768,) 2022-03-16 04:43:58,603.603 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.tag_logit.predictions.transform.dense.weight loaded from bert.tag_logit.predictions.transform.dense.weight of shape (768, 768) 2022-03-16 04:43:58,603.603 2829:checkpoint.py:104 align_and_update_state_dicts(): target model param = 288; name matched = 158; loaded = 158 2022-03-16 04:43:58,604.604 2829:checkpoint.py:107 align_and_update_state_dicts(): from loaded; ignore = [] 2022-03-16 04:43:58,608.608 2829:torch_common.py:1069 load_model_state_ignore_mismatch(): unique keys in init dict = []; total = 0 2022-03-16 04:43:58,744.744 2829:torch_common.py:1074 load_model_state_ignore_mismatch(): unique key (not initialized) in current model = ['module.module.bert.embeddings.word_embeddings.weight', 'module.module.bert.embeddings.position_embeddings.weight', 'module.module.bert.embeddings.token_type_embeddings.weight', 'module.module.bert.embeddings.LayerNorm.weight', 'module.module.bert.embeddings.LayerNorm.bias', 'module.module.bert.extra_embeddings.word_embeddings.weight', 'module.module.bert.extra_embeddings.position_embeddings.weight', 'module.module.bert.extra_embeddings.token_type_embeddings.weight', 'module.module.bert.extra_embeddings.LayerNorm.weight', 'module.module.bert.extra_embeddings.LayerNorm.bias', 
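Each align_and_update_state_dicts() line above pairs a target name such as module.module.bert.encoder.blocks.0.attn.proj.bias with the checkpoint name it ends with (bert.encoder.blocks.0.attn.proj.bias), so the match tolerates the wrapper prefixes added by DistributedDataParallel and the captioning wrapper; the summary just logged reads "target model param = 288; name matched = 158; loaded = 158". Below is a sketch of such suffix matching, under the assumption that the real checkpoint.py works this way (its tie-breaking may differ):

```python
def align_by_suffix(model_state, ckpt_state):
    """Pair each model key with a checkpoint key it ends with at a
    '.' boundary (tolerating wrapper prefixes such as the doubled
    'module.'), preferring the longest match and requiring equal
    tensor shapes. Unmatched keys keep their fresh initialization."""
    aligned = {}
    for tgt, tgt_val in model_state.items():
        candidates = [
            src for src, src_val in ckpt_state.items()
            if (tgt == src or tgt.endswith("." + src))
            and src_val.shape == tgt_val.shape
        ]
        if candidates:
            aligned[tgt] = ckpt_state[max(candidates, key=len)]
    return aligned
```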
'module.module.bert.encoder.tag_blocks.0.norm1.weight', 'module.module.bert.encoder.tag_blocks.0.norm1.bias', 'module.module.bert.encoder.tag_blocks.0.attn.qkv.weight', 'module.module.bert.encoder.tag_blocks.0.attn.qkv.bias', 'module.module.bert.encoder.tag_blocks.0.attn.proj.weight', 'module.module.bert.encoder.tag_blocks.0.attn.proj.bias', 'module.module.bert.encoder.tag_blocks.0.norm2.weight', 'module.module.bert.encoder.tag_blocks.0.norm2.bias', 'module.module.bert.encoder.tag_blocks.0.mlp.fc1.weight', 'module.module.bert.encoder.tag_blocks.0.mlp.fc1.bias', 'module.module.bert.encoder.tag_blocks.0.mlp.fc2.weight', 'module.module.bert.encoder.tag_blocks.0.mlp.fc2.bias', 'module.module.bert.encoder.tag_blocks.1.norm1.weight', 'module.module.bert.encoder.tag_blocks.1.norm1.bias', 'module.module.bert.encoder.tag_blocks.1.attn.qkv.weight', 'module.module.bert.encoder.tag_blocks.1.attn.qkv.bias', 'module.module.bert.encoder.tag_blocks.1.attn.proj.weight', 'module.module.bert.encoder.tag_blocks.1.attn.proj.bias', 'module.module.bert.encoder.tag_blocks.1.norm2.weight', 'module.module.bert.encoder.tag_blocks.1.norm2.bias', 'module.module.bert.encoder.tag_blocks.1.mlp.fc1.weight', 'module.module.bert.encoder.tag_blocks.1.mlp.fc1.bias', 'module.module.bert.encoder.tag_blocks.1.mlp.fc2.weight', 'module.module.bert.encoder.tag_blocks.1.mlp.fc2.bias', 'module.module.bert.encoder.tag_blocks.2.norm1.weight', 'module.module.bert.encoder.tag_blocks.2.norm1.bias', 'module.module.bert.encoder.tag_blocks.2.attn.qkv.weight', 'module.module.bert.encoder.tag_blocks.2.attn.qkv.bias', 'module.module.bert.encoder.tag_blocks.2.attn.proj.weight', 'module.module.bert.encoder.tag_blocks.2.attn.proj.bias', 'module.module.bert.encoder.tag_blocks.2.norm2.weight', 'module.module.bert.encoder.tag_blocks.2.norm2.bias', 'module.module.bert.encoder.tag_blocks.2.mlp.fc1.weight', 'module.module.bert.encoder.tag_blocks.2.mlp.fc1.bias', 'module.module.bert.encoder.tag_blocks.2.mlp.fc2.weight', 'module.module.bert.encoder.tag_blocks.2.mlp.fc2.bias', 'module.module.bert.encoder.tag_blocks.3.norm1.weight', 'module.module.bert.encoder.tag_blocks.3.norm1.bias', 'module.module.bert.encoder.tag_blocks.3.attn.qkv.weight', 'module.module.bert.encoder.tag_blocks.3.attn.qkv.bias', 'module.module.bert.encoder.tag_blocks.3.attn.proj.weight', 'module.module.bert.encoder.tag_blocks.3.attn.proj.bias', 'module.module.bert.encoder.tag_blocks.3.norm2.weight', 'module.module.bert.encoder.tag_blocks.3.norm2.bias', 'module.module.bert.encoder.tag_blocks.3.mlp.fc1.weight', 'module.module.bert.encoder.tag_blocks.3.mlp.fc1.bias', 'module.module.bert.encoder.tag_blocks.3.mlp.fc2.weight', 'module.module.bert.encoder.tag_blocks.3.mlp.fc2.bias', 'module.module.bert.caption_pooler.dense.weight', 'module.module.bert.caption_pooler.dense.bias', 'module.module.bert.decoder.layer.0.attention.self.query.weight', 'module.module.bert.decoder.layer.0.attention.self.query.bias', 'module.module.bert.decoder.layer.0.attention.self.key.weight', 'module.module.bert.decoder.layer.0.attention.self.key.bias', 'module.module.bert.decoder.layer.0.attention.self.value.weight', 'module.module.bert.decoder.layer.0.attention.self.value.bias', 'module.module.bert.decoder.layer.0.attention.output.dense.weight', 'module.module.bert.decoder.layer.0.attention.output.dense.bias', 'module.module.bert.decoder.layer.0.attention.output.LayerNorm.weight', 'module.module.bert.decoder.layer.0.attention.output.LayerNorm.bias', 'module.module.bert.decoder.layer.0.intermediate.dense.weight', 
'module.module.bert.decoder.layer.0.intermediate.dense.bias', 'module.module.bert.decoder.layer.0.output.dense.weight', 'module.module.bert.decoder.layer.0.output.dense.bias', 'module.module.bert.decoder.layer.0.output.LayerNorm.weight', 'module.module.bert.decoder.layer.0.output.LayerNorm.bias', 'module.module.bert.decoder.layer.1.attention.self.query.weight', 'module.module.bert.decoder.layer.1.attention.self.query.bias', 'module.module.bert.decoder.layer.1.attention.self.key.weight', 'module.module.bert.decoder.layer.1.attention.self.key.bias', 'module.module.bert.decoder.layer.1.attention.self.value.weight', 'module.module.bert.decoder.layer.1.attention.self.value.bias', 'module.module.bert.decoder.layer.1.attention.output.dense.weight', 'module.module.bert.decoder.layer.1.attention.output.dense.bias', 'module.module.bert.decoder.layer.1.attention.output.LayerNorm.weight', 'module.module.bert.decoder.layer.1.attention.output.LayerNorm.bias', 'module.module.bert.decoder.layer.1.intermediate.dense.weight', 'module.module.bert.decoder.layer.1.intermediate.dense.bias', 'module.module.bert.decoder.layer.1.output.dense.weight', 'module.module.bert.decoder.layer.1.output.dense.bias', 'module.module.bert.decoder.layer.1.output.LayerNorm.weight', 'module.module.bert.decoder.layer.1.output.LayerNorm.bias', 'module.module.bert.decoder.layer.2.attention.self.query.weight', 'module.module.bert.decoder.layer.2.attention.self.query.bias', 'module.module.bert.decoder.layer.2.attention.self.key.weight', 'module.module.bert.decoder.layer.2.attention.self.key.bias', 'module.module.bert.decoder.layer.2.attention.self.value.weight', 'module.module.bert.decoder.layer.2.attention.self.value.bias', 'module.module.bert.decoder.layer.2.attention.output.dense.weight', 'module.module.bert.decoder.layer.2.attention.output.dense.bias', 'module.module.bert.decoder.layer.2.attention.output.LayerNorm.weight', 'module.module.bert.decoder.layer.2.attention.output.LayerNorm.bias', 'module.module.bert.decoder.layer.2.intermediate.dense.weight', 'module.module.bert.decoder.layer.2.intermediate.dense.bias', 'module.module.bert.decoder.layer.2.output.dense.weight', 'module.module.bert.decoder.layer.2.output.dense.bias', 'module.module.bert.decoder.layer.2.output.LayerNorm.weight', 'module.module.bert.decoder.layer.2.output.LayerNorm.bias', 'module.module.bert.decoder.layer.3.attention.self.query.weight', 'module.module.bert.decoder.layer.3.attention.self.query.bias', 'module.module.bert.decoder.layer.3.attention.self.key.weight', 'module.module.bert.decoder.layer.3.attention.self.key.bias', 'module.module.bert.decoder.layer.3.attention.self.value.weight', 'module.module.bert.decoder.layer.3.attention.self.value.bias', 'module.module.bert.decoder.layer.3.attention.output.dense.weight', 'module.module.bert.decoder.layer.3.attention.output.dense.bias', 'module.module.bert.decoder.layer.3.attention.output.LayerNorm.weight', 'module.module.bert.decoder.layer.3.attention.output.LayerNorm.bias', 'module.module.bert.decoder.layer.3.intermediate.dense.weight', 'module.module.bert.decoder.layer.3.intermediate.dense.bias', 'module.module.bert.decoder.layer.3.output.dense.weight', 'module.module.bert.decoder.layer.3.output.dense.bias', 'module.module.bert.decoder.layer.3.output.LayerNorm.weight', 'module.module.bert.decoder.layer.3.output.LayerNorm.bias', 'module.module.cls.predictions.bias', 'module.module.cls.predictions.transform.dense.weight', 'module.module.cls.predictions.transform.dense.bias', 
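The "unique key (not initialized)" list that runs through here (its tail follows just below) covers the parts of the model with no counterpart in the checkpoint: the text embeddings and extra_embeddings, the four tag_blocks, the caption decoder layers, and the cls prediction head. Those weights train from scratch. In plain PyTorch the same partial load is a non-strict load_state_dict; reusing `aligned` from the sketch above:

```python
# Plain-PyTorch equivalent of load_model_state_ignore_mismatch (a sketch,
# not the pipeline's own implementation):
result = model.load_state_dict(aligned, strict=False)
print("unique key (not initialized) in current model =", result.missing_keys)
print("from loaded; ignore =", result.unexpected_keys)
```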
'module.module.cls.predictions.transform.LayerNorm.weight', 'module.module.cls.predictions.transform.LayerNorm.bias', 'module.module.cls.predictions.decoder.weight'] 2022-03-16 04:43:58,755.755 2829:tagger_caption_uni_pipeline_expanding.py:625 train(): 2022-03-16 04:44:27,988.988 2829:qd_common.py:3452 print_frame_info(): func name = __init__; self = ; data = TaxCocoCaption; split = train; add_key = False; backend = cv; hold_buffer = 0; save_original = False 2022-03-16 04:44:28,245.245 2829:samplers.py:158 __init__(): before making divisible = 17711 2022-03-16 04:44:28,245.245 2829:samplers.py:161 __init__(): adjust to = 17712 2022-03-16 04:44:28,245.245 2829:uni_pipeline.py:509 get_data_loader(): sampler = 2022-03-16 04:44:28,246.246 2829:uni_pipeline.py:742 do_train(): DistributedDataParallel( (module): ImageCaptioning( (module): TaggerEncDecSplitForImageCaptioning( (bert): TaggerEncDecCLSEmbSplitBertImgModel( (embeddings): BertEmbeddings( (word_embeddings): Embedding(30522, 768, padding_idx=0) (position_embeddings): Embedding(512, 768) (token_type_embeddings): Embedding(2, 768) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0, inplace=False) ) (extra_embeddings): BertEmbeddings( (word_embeddings): Embedding(30522, 768, padding_idx=0) (position_embeddings): Embedding(512, 768) (token_type_embeddings): Embedding(2, 768) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0, inplace=False) ) (encoder): TIMMVitSplitEncoder( (blocks): ModuleList( (0): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (1): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (2): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (3): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) 
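Two details in the setup just logged: the train split has 17711 samples, and the sampler pads that to 17712 so batches divide evenly across ranks. 17712 is the next multiple of 16 above 17711; the actual divisor (world size, or world size times some chunk size) is not stated in the log, so 16 below is a guess. The adjustment itself is a one-liner:

```python
import math

def make_divisible(n, divisor):
    """Round n up to the next multiple of divisor; the padded tail is
    typically filled by wrapping around to the first samples."""
    return int(math.ceil(n / divisor)) * divisor

print(make_divisible(17711, 16))  # 17712 -- divisor 16 is a guess from the log
```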
(proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (4): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (5): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (6): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (7): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (8): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (9): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() 
(norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (10): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (11): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) ) (tag_blocks): ModuleList( (0): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (1): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (2): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (3): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, 
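Every Block in this dump, the twelve encoder `blocks` and the four `tag_blocks` alike, repeats the same pre-norm layout: LayerNorm, fused-qkv attention, residual add, then LayerNorm, MLP (fc1 → GELU → fc2), residual add, with all dropouts at p=0.0 and drop_path reduced to Identity. A condensed, timm-style paraphrase (num_heads=12 is assumed from the 768-dim width; it is not printed in the dump):

```python
import torch.nn as nn

class Block(nn.Module):
    """Pre-norm transformer block matching the dump: 768-dim, 3072 MLP."""
    def __init__(self, dim=768, mlp_ratio=4.0, num_heads=12):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim, eps=1e-6)
        # nn.MultiheadAttention keeps the same fused (3*dim, dim) qkv weight.
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim, eps=1e-6)
        hidden = int(dim * mlp_ratio)
        self.mlp = nn.Sequential(
            nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim))

    def forward(self, x):
        h = self.norm1(x)
        x = x + self.attn(h, h, h, need_weights=False)[0]  # residual 1
        return x + self.mlp(self.norm2(x))                 # residual 2
```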
elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) ) ) (caption_pooler): BertPooler( (dense): Linear(in_features=768, out_features=768, bias=True) (activation): Tanh() ) (pooler): BertPooler( (dense): Linear(in_features=768, out_features=768, bias=True) (activation): Tanh() ) (tag_logit): BertCaptioningHeads( (predictions): BertLMPredictionHead( (transform): BertPredictionHeadTransform( (dense): Linear(in_features=768, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) ) (decoder): Linear(in_features=768, out_features=30522, bias=False) ) ) (decoder): BertEncoder( (layer): ModuleList( (0): BertLayer( (attention): BertAttention( (self): BertSelfAttention( (query): Linear(in_features=768, out_features=768, bias=True) (key): Linear(in_features=768, out_features=768, bias=True) (value): Linear(in_features=768, out_features=768, bias=True) (dropout): Dropout(p=0.1, inplace=False) ) (output): BertSelfOutput( (dense): Linear(in_features=768, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0, inplace=False) ) ) (intermediate): BertIntermediate( (dense): Linear(in_features=768, out_features=3072, bias=True) ) (output): BertOutput( (dense): Linear(in_features=3072, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0, inplace=False) ) ) (1): BertLayer( (attention): BertAttention( (self): BertSelfAttention( (query): Linear(in_features=768, out_features=768, bias=True) (key): Linear(in_features=768, out_features=768, bias=True) (value): Linear(in_features=768, out_features=768, bias=True) (dropout): Dropout(p=0.1, inplace=False) ) (output): BertSelfOutput( (dense): Linear(in_features=768, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0, inplace=False) ) ) (intermediate): BertIntermediate( (dense): Linear(in_features=768, out_features=3072, bias=True) ) (output): BertOutput( (dense): Linear(in_features=3072, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0, inplace=False) ) ) (2): BertLayer( (attention): BertAttention( (self): BertSelfAttention( (query): Linear(in_features=768, out_features=768, bias=True) (key): Linear(in_features=768, out_features=768, bias=True) (value): Linear(in_features=768, out_features=768, bias=True) (dropout): Dropout(p=0.1, inplace=False) ) (output): BertSelfOutput( (dense): Linear(in_features=768, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0, inplace=False) ) ) (intermediate): BertIntermediate( (dense): Linear(in_features=768, out_features=3072, bias=True) ) (output): BertOutput( (dense): Linear(in_features=3072, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0, inplace=False) ) ) (3): BertLayer( (attention): BertAttention( (self): BertSelfAttention( (query): Linear(in_features=768, out_features=768, bias=True) (key): Linear(in_features=768, out_features=768, bias=True) (value): Linear(in_features=768, out_features=768, bias=True) (dropout): Dropout(p=0.1, inplace=False) ) (output): BertSelfOutput( (dense): Linear(in_features=768, out_features=768, 
bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0, inplace=False) ) ) (intermediate): BertIntermediate( (dense): Linear(in_features=768, out_features=3072, bias=True) ) (output): BertOutput( (dense): Linear(in_features=3072, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0, inplace=False) ) ) ) ) (img_embedding): Identity() (dropout): Identity() ) (cls): BertCaptioningHeads( (predictions): BertLMPredictionHead( (transform): BertPredictionHeadTransform( (dense): Linear(in_features=768, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) ) (decoder): Linear(in_features=768, out_features=30522, bias=False) ) ) (loss): BertCaptioningLoss( (log_soft): LogSoftmax(dim=1) (kl): KLDivLoss() ) (tag_loss): FocalLossWithLogitsNegLoss(alpha=0.5, gamma=1) ) (image_encoder): InputAsDict( (module): VisionTransformer( (patch_embed): PatchEmbed( (proj): Conv2d(3, 768, kernel_size=(16, 16), stride=(16, 16)) ) (pos_drop): Dropout(p=0.0, inplace=False) (blocks): ModuleList() (norm): Identity() (pre_logits): Identity() (head): Linear(in_features=768, out_features=1000, bias=True) ) ) ) ) 2022-03-16 04:44:28,251.251 2829:uni_pipeline.py:744 do_train(): : training=True 2022-03-16 04:44:28,252.252 2829:uni_pipeline.py:744 do_train(): module: training=True 2022-03-16 04:44:28,252.252 2829:uni_pipeline.py:744 do_train(): module.module: training=True 2022-03-16 04:44:28,252.252 2829:uni_pipeline.py:744 do_train(): module.module.bert: training=True 2022-03-16 04:44:28,252.252 2829:uni_pipeline.py:744 do_train(): module.module.bert.embeddings: training=True 2022-03-16 04:44:28,252.252 2829:uni_pipeline.py:744 do_train(): module.module.bert.embeddings.word_embeddings: training=True 2022-03-16 04:44:28,252.252 2829:uni_pipeline.py:744 do_train(): module.module.bert.embeddings.position_embeddings: training=True 2022-03-16 04:44:28,252.252 2829:uni_pipeline.py:744 do_train(): module.module.bert.embeddings.token_type_embeddings: training=True 2022-03-16 04:44:28,252.252 2829:uni_pipeline.py:744 do_train(): module.module.bert.embeddings.LayerNorm: training=True 2022-03-16 04:44:28,252.252 2829:uni_pipeline.py:744 do_train(): module.module.bert.embeddings.dropout: training=True 2022-03-16 04:44:28,253.253 2829:uni_pipeline.py:744 do_train(): module.module.bert.extra_embeddings: training=True 2022-03-16 04:44:28,253.253 2829:uni_pipeline.py:744 do_train(): module.module.bert.extra_embeddings.word_embeddings: training=True 2022-03-16 04:44:28,253.253 2829:uni_pipeline.py:744 do_train(): module.module.bert.extra_embeddings.position_embeddings: training=True 2022-03-16 04:44:28,253.253 2829:uni_pipeline.py:744 do_train(): module.module.bert.extra_embeddings.token_type_embeddings: training=True 2022-03-16 04:44:28,253.253 2829:uni_pipeline.py:744 do_train(): module.module.bert.extra_embeddings.LayerNorm: training=True 2022-03-16 04:44:28,253.253 2829:uni_pipeline.py:744 do_train(): module.module.bert.extra_embeddings.dropout: training=True 2022-03-16 04:44:28,253.253 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder: training=True 2022-03-16 04:44:28,253.253 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks: training=True 2022-03-16 04:44:28,253.253 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.0: training=True 2022-03-16 04:44:28,253.253 2829:uni_pipeline.py:744 do_train(): 
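The loss modules at the end of the dump show the split objective: BertCaptioningLoss (a LogSoftmax + KLDivLoss pair, i.e. label-smoothed cross-entropy) for caption tokens, and FocalLossWithLogitsNegLoss(alpha=0.5, gamma=1) for the multi-label tag head. The exact form of the latter is not in the log; a standard binary focal loss with logits, which the name suggests, looks like:

```python
import torch
import torch.nn.functional as F

def focal_loss_with_logits(logits, targets, alpha=0.5, gamma=1.0):
    """Generic binary focal loss (an assumption about what
    FocalLossWithLogitsNegLoss computes; the repo's version may differ,
    e.g. in how it weights the negative term). targets: multi-hot floats."""
    p = torch.sigmoid(logits)
    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p_t = p * targets + (1 - p) * (1 - targets)            # prob of true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    return (alpha_t * (1 - p_t) ** gamma * ce).mean()
```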
module.module.bert.encoder.blocks.0.norm1: training=True 2022-03-16 04:44:28,254.254 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.0.attn: training=True 2022-03-16 04:44:28,254.254 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.0.attn.qkv: training=True 2022-03-16 04:44:28,254.254 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.0.attn.attn_drop: training=True 2022-03-16 04:44:28,254.254 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.0.attn.proj: training=True 2022-03-16 04:44:28,254.254 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.0.attn.proj_drop: training=True 2022-03-16 04:44:28,254.254 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.0.drop_path: training=True 2022-03-16 04:44:28,254.254 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.0.norm2: training=True 2022-03-16 04:44:28,254.254 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.0.mlp: training=True 2022-03-16 04:44:28,254.254 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.0.mlp.fc1: training=True 2022-03-16 04:44:28,254.254 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.0.mlp.act: training=True 2022-03-16 04:44:28,255.255 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.0.mlp.fc2: training=True 2022-03-16 04:44:28,255.255 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.0.mlp.drop: training=True 2022-03-16 04:44:28,255.255 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.1: training=True 2022-03-16 04:44:28,255.255 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.1.norm1: training=True 2022-03-16 04:44:28,255.255 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.1.attn: training=True 2022-03-16 04:44:28,255.255 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.1.attn.qkv: training=True 2022-03-16 04:44:28,255.255 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.1.attn.attn_drop: training=True 2022-03-16 04:44:28,255.255 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.1.attn.proj: training=True 2022-03-16 04:44:28,255.255 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.1.attn.proj_drop: training=True 2022-03-16 04:44:28,255.255 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.1.drop_path: training=True 2022-03-16 04:44:28,256.256 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.1.norm2: training=True 2022-03-16 04:44:28,256.256 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.1.mlp: training=True 2022-03-16 04:44:28,256.256 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.1.mlp.fc1: training=True 2022-03-16 04:44:28,256.256 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.1.mlp.act: training=True 2022-03-16 04:44:28,256.256 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.1.mlp.fc2: training=True 2022-03-16 04:44:28,256.256 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.1.mlp.drop: training=True 2022-03-16 04:44:28,256.256 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.2: training=True 2022-03-16 04:44:28,256.256 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.2.norm1: training=True 2022-03-16 
04:44:28,256.256 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.2.attn: training=True 2022-03-16 04:44:28,256.256 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.2.attn.qkv: training=True 2022-03-16 04:44:28,256.256 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.2.attn.attn_drop: training=True 2022-03-16 04:44:28,256.256 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.2.attn.proj: training=True 2022-03-16 04:44:28,256.256 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.2.attn.proj_drop: training=True 2022-03-16 04:44:28,257.257 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.2.drop_path: training=True 2022-03-16 04:44:28,257.257 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.2.norm2: training=True 2022-03-16 04:44:28,257.257 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.2.mlp: training=True 2022-03-16 04:44:28,257.257 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.2.mlp.fc1: training=True 2022-03-16 04:44:28,257.257 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.2.mlp.act: training=True 2022-03-16 04:44:28,257.257 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.2.mlp.fc2: training=True 2022-03-16 04:44:28,257.257 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.2.mlp.drop: training=True 2022-03-16 04:44:28,257.257 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.3: training=True 2022-03-16 04:44:28,257.257 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.3.norm1: training=True 2022-03-16 04:44:28,257.257 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.3.attn: training=True 2022-03-16 04:44:28,257.257 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.3.attn.qkv: training=True 2022-03-16 04:44:28,257.257 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.3.attn.attn_drop: training=True 2022-03-16 04:44:28,257.257 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.3.attn.proj: training=True 2022-03-16 04:44:28,258.258 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.3.attn.proj_drop: training=True 2022-03-16 04:44:28,258.258 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.3.drop_path: training=True 2022-03-16 04:44:28,258.258 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.3.norm2: training=True 2022-03-16 04:44:28,258.258 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.3.mlp: training=True 2022-03-16 04:44:28,258.258 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.3.mlp.fc1: training=True 2022-03-16 04:44:28,258.258 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.3.mlp.act: training=True 2022-03-16 04:44:28,258.258 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.3.mlp.fc2: training=True 2022-03-16 04:44:28,258.258 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.3.mlp.drop: training=True 2022-03-16 04:44:28,258.258 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.4: training=True 2022-03-16 04:44:28,258.258 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.4.norm1: training=True 2022-03-16 04:44:28,258.258 2829:uni_pipeline.py:744 do_train(): 
module.module.bert.encoder.blocks.4.attn: training=True 2022-03-16 04:44:28,258.258 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.4.attn.qkv: training=True 2022-03-16 04:44:28,258.258 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.4.attn.attn_drop: training=True 2022-03-16 04:44:28,258.258 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.4.attn.proj: training=True 2022-03-16 04:44:28,259.259 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.4.attn.proj_drop: training=True 2022-03-16 04:44:28,259.259 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.4.drop_path: training=True 2022-03-16 04:44:28,259.259 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.4.norm2: training=True 2022-03-16 04:44:28,259.259 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.4.mlp: training=True 2022-03-16 04:44:28,259.259 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.4.mlp.fc1: training=True 2022-03-16 04:44:28,259.259 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.4.mlp.act: training=True 2022-03-16 04:44:28,259.259 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.4.mlp.fc2: training=True 2022-03-16 04:44:28,259.259 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.4.mlp.drop: training=True 2022-03-16 04:44:28,259.259 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.5: training=True 2022-03-16 04:44:28,259.259 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.5.norm1: training=True 2022-03-16 04:44:28,259.259 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.5.attn: training=True 2022-03-16 04:44:28,259.259 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.5.attn.qkv: training=True 2022-03-16 04:44:28,260.260 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.5.attn.attn_drop: training=True 2022-03-16 04:44:28,260.260 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.5.attn.proj: training=True 2022-03-16 04:44:28,260.260 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.5.attn.proj_drop: training=True 2022-03-16 04:44:28,260.260 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.5.drop_path: training=True 2022-03-16 04:44:28,260.260 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.5.norm2: training=True 2022-03-16 04:44:28,260.260 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.5.mlp: training=True 2022-03-16 04:44:28,260.260 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.5.mlp.fc1: training=True 2022-03-16 04:44:28,260.260 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.5.mlp.act: training=True 2022-03-16 04:44:28,260.260 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.5.mlp.fc2: training=True 2022-03-16 04:44:28,260.260 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.5.mlp.drop: training=True 2022-03-16 04:44:28,260.260 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.6: training=True 2022-03-16 04:44:28,261.261 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.6.norm1: training=True 2022-03-16 04:44:28,261.261 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.6.attn: training=True 2022-03-16 
04:44:28,261.261 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.6.attn.qkv: training=True 2022-03-16 04:44:28,261.261 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.6.attn.attn_drop: training=True 2022-03-16 04:44:28,261.261 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.6.attn.proj: training=True 2022-03-16 04:44:28,261.261 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.6.attn.proj_drop: training=True 2022-03-16 04:44:28,261.261 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.6.drop_path: training=True 2022-03-16 04:44:28,261.261 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.6.norm2: training=True 2022-03-16 04:44:28,261.261 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.6.mlp: training=True 2022-03-16 04:44:28,261.261 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.6.mlp.fc1: training=True 2022-03-16 04:44:28,261.261 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.6.mlp.act: training=True 2022-03-16 04:44:28,261.261 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.6.mlp.fc2: training=True 2022-03-16 04:44:28,261.261 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.6.mlp.drop: training=True 2022-03-16 04:44:28,261.261 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.7: training=True 2022-03-16 04:44:28,262.262 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.7.norm1: training=True 2022-03-16 04:44:28,262.262 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.7.attn: training=True 2022-03-16 04:44:28,262.262 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.7.attn.qkv: training=True 2022-03-16 04:44:28,262.262 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.7.attn.attn_drop: training=True 2022-03-16 04:44:28,262.262 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.7.attn.proj: training=True 2022-03-16 04:44:28,262.262 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.7.attn.proj_drop: training=True 2022-03-16 04:44:28,262.262 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.7.drop_path: training=True 2022-03-16 04:44:28,262.262 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.7.norm2: training=True 2022-03-16 04:44:28,262.262 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.7.mlp: training=True 2022-03-16 04:44:28,262.262 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.7.mlp.fc1: training=True 2022-03-16 04:44:28,262.262 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.7.mlp.act: training=True 2022-03-16 04:44:28,262.262 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.7.mlp.fc2: training=True 2022-03-16 04:44:28,262.262 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.7.mlp.drop: training=True 2022-03-16 04:44:28,262.262 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.8: training=True 2022-03-16 04:44:28,263.263 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.8.norm1: training=True 2022-03-16 04:44:28,263.263 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.8.attn: training=True 2022-03-16 04:44:28,263.263 2829:uni_pipeline.py:744 do_train(): 
module.module.bert.encoder.blocks.8.attn.qkv: training=True 2022-03-16 04:44:28,263.263 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.8.attn.attn_drop: training=True 2022-03-16 04:44:28,263.263 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.8.attn.proj: training=True 2022-03-16 04:44:28,263.263 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.8.attn.proj_drop: training=True 2022-03-16 04:44:28,263.263 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.8.drop_path: training=True 2022-03-16 04:44:28,263.263 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.8.norm2: training=True 2022-03-16 04:44:28,263.263 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.8.mlp: training=True 2022-03-16 04:44:28,263.263 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.8.mlp.fc1: training=True 2022-03-16 04:44:28,263.263 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.8.mlp.act: training=True 2022-03-16 04:44:28,263.263 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.8.mlp.fc2: training=True 2022-03-16 04:44:28,263.263 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.8.mlp.drop: training=True 2022-03-16 04:44:28,263.263 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.9: training=True 2022-03-16 04:44:28,263.263 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.9.norm1: training=True 2022-03-16 04:44:28,263.263 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.9.attn: training=True 2022-03-16 04:44:28,264.264 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.9.attn.qkv: training=True 2022-03-16 04:44:28,264.264 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.9.attn.attn_drop: training=True 2022-03-16 04:44:28,264.264 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.9.attn.proj: training=True 2022-03-16 04:44:28,264.264 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.9.attn.proj_drop: training=True 2022-03-16 04:44:28,264.264 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.9.drop_path: training=True 2022-03-16 04:44:28,264.264 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.9.norm2: training=True 2022-03-16 04:44:28,264.264 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.9.mlp: training=True 2022-03-16 04:44:28,264.264 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.9.mlp.fc1: training=True 2022-03-16 04:44:28,264.264 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.9.mlp.act: training=True 2022-03-16 04:44:28,264.264 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.9.mlp.fc2: training=True 2022-03-16 04:44:28,264.264 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.9.mlp.drop: training=True 2022-03-16 04:44:28,264.264 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.10: training=True 2022-03-16 04:44:28,264.264 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.10.norm1: training=True 2022-03-16 04:44:28,264.264 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.10.attn: training=True 2022-03-16 04:44:28,265.265 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.10.attn.qkv: training=True 2022-03-16 
04:44:28,265.265 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.10.attn.attn_drop: training=True 2022-03-16 04:44:28,265.265 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.10.attn.proj: training=True 2022-03-16 04:44:28,265.265 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.10.attn.proj_drop: training=True 2022-03-16 04:44:28,265.265 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.10.drop_path: training=True 2022-03-16 04:44:28,265.265 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.10.norm2: training=True 2022-03-16 04:44:28,265.265 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.10.mlp: training=True 2022-03-16 04:44:28,265.265 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.10.mlp.fc1: training=True 2022-03-16 04:44:28,265.265 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.10.mlp.act: training=True 2022-03-16 04:44:28,265.265 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.10.mlp.fc2: training=True 2022-03-16 04:44:28,265.265 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.10.mlp.drop: training=True 2022-03-16 04:44:28,265.265 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.11: training=True 2022-03-16 04:44:28,265.265 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.11.norm1: training=True 2022-03-16 04:44:28,266.266 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.11.attn: training=True 2022-03-16 04:44:28,266.266 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.11.attn.qkv: training=True 2022-03-16 04:44:28,266.266 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.11.attn.attn_drop: training=True 2022-03-16 04:44:28,266.266 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.11.attn.proj: training=True 2022-03-16 04:44:28,266.266 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.11.attn.proj_drop: training=True 2022-03-16 04:44:28,266.266 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.11.drop_path: training=True 2022-03-16 04:44:28,266.266 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.11.norm2: training=True 2022-03-16 04:44:28,266.266 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.11.mlp: training=True 2022-03-16 04:44:28,266.266 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.11.mlp.fc1: training=True 2022-03-16 04:44:28,266.266 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.11.mlp.act: training=True 2022-03-16 04:44:28,266.266 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.11.mlp.fc2: training=True 2022-03-16 04:44:28,266.266 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.11.mlp.drop: training=True 2022-03-16 04:44:28,266.266 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks: training=True 2022-03-16 04:44:28,267.267 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.0: training=True 2022-03-16 04:44:28,267.267 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.0.norm1: training=True 2022-03-16 04:44:28,267.267 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.0.attn: training=True 2022-03-16 04:44:28,267.267 2829:uni_pipeline.py:744 
do_train(): module.module.bert.encoder.tag_blocks.0.attn.qkv: training=True 2022-03-16 04:44:28,267.267 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.0.attn.attn_drop: training=True 2022-03-16 04:44:28,267.267 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.0.attn.proj: training=True 2022-03-16 04:44:28,267.267 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.0.attn.proj_drop: training=True 2022-03-16 04:44:28,267.267 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.0.drop_path: training=True 2022-03-16 04:44:28,267.267 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.0.norm2: training=True 2022-03-16 04:44:28,267.267 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.0.mlp: training=True 2022-03-16 04:44:28,268.268 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.0.mlp.fc1: training=True 2022-03-16 04:44:28,268.268 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.0.mlp.act: training=True 2022-03-16 04:44:28,268.268 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.0.mlp.fc2: training=True 2022-03-16 04:44:28,268.268 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.0.mlp.drop: training=True 2022-03-16 04:44:28,268.268 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.1: training=True 2022-03-16 04:44:28,268.268 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.1.norm1: training=True 2022-03-16 04:44:28,268.268 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.1.attn: training=True 2022-03-16 04:44:28,268.268 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.1.attn.qkv: training=True 2022-03-16 04:44:28,268.268 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.1.attn.attn_drop: training=True 2022-03-16 04:44:28,268.268 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.1.attn.proj: training=True 2022-03-16 04:44:28,268.268 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.1.attn.proj_drop: training=True 2022-03-16 04:44:28,268.268 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.1.drop_path: training=True 2022-03-16 04:44:28,268.268 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.1.norm2: training=True 2022-03-16 04:44:28,269.269 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.1.mlp: training=True 2022-03-16 04:44:28,269.269 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.1.mlp.fc1: training=True 2022-03-16 04:44:28,269.269 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.1.mlp.act: training=True 2022-03-16 04:44:28,269.269 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.1.mlp.fc2: training=True 2022-03-16 04:44:28,269.269 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.1.mlp.drop: training=True 2022-03-16 04:44:28,269.269 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.2: training=True 2022-03-16 04:44:28,269.269 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.2.norm1: training=True 2022-03-16 04:44:28,269.269 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.2.attn: training=True 2022-03-16 
04:44:28,269.269 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.2.attn.qkv: training=True 2022-03-16 04:44:28,269.269 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.2.attn.attn_drop: training=True 2022-03-16 04:44:28,269.269 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.2.attn.proj: training=True 2022-03-16 04:44:28,269.269 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.2.attn.proj_drop: training=True 2022-03-16 04:44:28,269.269 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.2.drop_path: training=True 2022-03-16 04:44:28,270.270 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.2.norm2: training=True 2022-03-16 04:44:28,270.270 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.2.mlp: training=True 2022-03-16 04:44:28,270.270 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.2.mlp.fc1: training=True 2022-03-16 04:44:28,270.270 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.2.mlp.act: training=True 2022-03-16 04:44:28,270.270 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.2.mlp.fc2: training=True 2022-03-16 04:44:28,270.270 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.2.mlp.drop: training=True 2022-03-16 04:44:28,270.270 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.3: training=True 2022-03-16 04:44:28,270.270 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.3.norm1: training=True 2022-03-16 04:44:28,270.270 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.3.attn: training=True 2022-03-16 04:44:28,270.270 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.3.attn.qkv: training=True 2022-03-16 04:44:28,270.270 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.3.attn.attn_drop: training=True 2022-03-16 04:44:28,271.271 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.3.attn.proj: training=True 2022-03-16 04:44:28,271.271 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.3.attn.proj_drop: training=True 2022-03-16 04:44:28,271.271 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.3.drop_path: training=True 2022-03-16 04:44:28,271.271 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.3.norm2: training=True 2022-03-16 04:44:28,271.271 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.3.mlp: training=True 2022-03-16 04:44:28,271.271 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.3.mlp.fc1: training=True 2022-03-16 04:44:28,271.271 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.3.mlp.act: training=True 2022-03-16 04:44:28,271.271 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.3.mlp.fc2: training=True 2022-03-16 04:44:28,271.271 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.3.mlp.drop: training=True 2022-03-16 04:44:28,271.271 2829:uni_pipeline.py:744 do_train(): module.module.bert.caption_pooler: training=True 2022-03-16 04:44:28,271.271 2829:uni_pipeline.py:744 do_train(): module.module.bert.caption_pooler.dense: training=True 2022-03-16 04:44:28,271.271 2829:uni_pipeline.py:744 do_train(): module.module.bert.caption_pooler.activation: 
training=True 2022-03-16 04:44:28,272.272 2829:uni_pipeline.py:744 do_train(): module.module.bert.pooler: training=True 2022-03-16 04:44:28,272.272 2829:uni_pipeline.py:744 do_train(): module.module.bert.pooler.dense: training=True 2022-03-16 04:44:28,272.272 2829:uni_pipeline.py:744 do_train(): module.module.bert.pooler.activation: training=True 2022-03-16 04:44:28,272.272 2829:uni_pipeline.py:744 do_train(): module.module.bert.tag_logit: training=True 2022-03-16 04:44:28,272.272 2829:uni_pipeline.py:744 do_train(): module.module.bert.tag_logit.predictions: training=True 2022-03-16 04:44:28,272.272 2829:uni_pipeline.py:744 do_train(): module.module.bert.tag_logit.predictions.transform: training=True 2022-03-16 04:44:28,272.272 2829:uni_pipeline.py:744 do_train(): module.module.bert.tag_logit.predictions.transform.dense: training=True 2022-03-16 04:44:28,272.272 2829:uni_pipeline.py:744 do_train(): module.module.bert.tag_logit.predictions.transform.LayerNorm: training=True 2022-03-16 04:44:28,272.272 2829:uni_pipeline.py:744 do_train(): module.module.bert.tag_logit.predictions.decoder: training=True 2022-03-16 04:44:28,272.272 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder: training=True 2022-03-16 04:44:28,272.272 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer: training=True 2022-03-16 04:44:28,272.272 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.0: training=True 2022-03-16 04:44:28,272.272 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.0.attention: training=True 2022-03-16 04:44:28,273.273 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.0.attention.self: training=True 2022-03-16 04:44:28,273.273 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.0.attention.self.query: training=True 2022-03-16 04:44:28,273.273 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.0.attention.self.key: training=True 2022-03-16 04:44:28,273.273 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.0.attention.self.value: training=True 2022-03-16 04:44:28,273.273 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.0.attention.self.dropout: training=True 2022-03-16 04:44:28,273.273 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.0.attention.output: training=True 2022-03-16 04:44:28,273.273 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.0.attention.output.dense: training=True 2022-03-16 04:44:28,273.273 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.0.attention.output.LayerNorm: training=True 2022-03-16 04:44:28,273.273 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.0.attention.output.dropout: training=True 2022-03-16 04:44:28,273.273 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.0.intermediate: training=True 2022-03-16 04:44:28,273.273 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.0.intermediate.dense: training=True 2022-03-16 04:44:28,273.273 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.0.output: training=True 2022-03-16 04:44:28,273.273 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.0.output.dense: training=True 2022-03-16 04:44:28,273.273 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.0.output.LayerNorm: training=True 2022-03-16 04:44:28,273.273 2829:uni_pipeline.py:744 do_train(): 
module.module.bert.decoder.layer.0.output.dropout: training=True 2022-03-16 04:44:28,274.274 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.1: training=True 2022-03-16 04:44:28,274.274 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.1.attention: training=True 2022-03-16 04:44:28,274.274 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.1.attention.self: training=True 2022-03-16 04:44:28,274.274 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.1.attention.self.query: training=True 2022-03-16 04:44:28,274.274 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.1.attention.self.key: training=True 2022-03-16 04:44:28,274.274 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.1.attention.self.value: training=True 2022-03-16 04:44:28,274.274 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.1.attention.self.dropout: training=True 2022-03-16 04:44:28,274.274 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.1.attention.output: training=True 2022-03-16 04:44:28,274.274 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.1.attention.output.dense: training=True 2022-03-16 04:44:28,274.274 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.1.attention.output.LayerNorm: training=True 2022-03-16 04:44:28,274.274 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.1.attention.output.dropout: training=True 2022-03-16 04:44:28,275.275 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.1.intermediate: training=True 2022-03-16 04:44:28,275.275 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.1.intermediate.dense: training=True 2022-03-16 04:44:28,275.275 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.1.output: training=True 2022-03-16 04:44:28,275.275 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.1.output.dense: training=True 2022-03-16 04:44:28,275.275 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.1.output.LayerNorm: training=True 2022-03-16 04:44:28,275.275 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.1.output.dropout: training=True 2022-03-16 04:44:28,275.275 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.2: training=True 2022-03-16 04:44:28,275.275 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.2.attention: training=True 2022-03-16 04:44:28,275.275 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.2.attention.self: training=True 2022-03-16 04:44:28,276.276 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.2.attention.self.query: training=True 2022-03-16 04:44:28,276.276 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.2.attention.self.key: training=True 2022-03-16 04:44:28,276.276 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.2.attention.self.value: training=True 2022-03-16 04:44:28,276.276 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.2.attention.self.dropout: training=True 2022-03-16 04:44:28,276.276 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.2.attention.output: training=True 2022-03-16 04:44:28,276.276 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.2.attention.output.dense: training=True 2022-03-16 04:44:28,276.276 2829:uni_pipeline.py:744 do_train(): 
module.module.bert.decoder.layer.2.attention.output.LayerNorm: training=True 2022-03-16 04:44:28,276.276 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.2.attention.output.dropout: training=True 2022-03-16 04:44:28,276.276 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.2.intermediate: training=True 2022-03-16 04:44:28,277.277 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.2.intermediate.dense: training=True 2022-03-16 04:44:28,277.277 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.2.output: training=True 2022-03-16 04:44:28,277.277 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.2.output.dense: training=True 2022-03-16 04:44:28,277.277 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.2.output.LayerNorm: training=True 2022-03-16 04:44:28,277.277 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.2.output.dropout: training=True 2022-03-16 04:44:28,277.277 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.3: training=True 2022-03-16 04:44:28,277.277 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.3.attention: training=True 2022-03-16 04:44:28,277.277 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.3.attention.self: training=True 2022-03-16 04:44:28,277.277 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.3.attention.self.query: training=True 2022-03-16 04:44:28,277.277 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.3.attention.self.key: training=True 2022-03-16 04:44:28,278.278 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.3.attention.self.value: training=True 2022-03-16 04:44:28,278.278 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.3.attention.self.dropout: training=True 2022-03-16 04:44:28,278.278 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.3.attention.output: training=True 2022-03-16 04:44:28,278.278 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.3.attention.output.dense: training=True 2022-03-16 04:44:28,278.278 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.3.attention.output.LayerNorm: training=True 2022-03-16 04:44:28,278.278 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.3.attention.output.dropout: training=True 2022-03-16 04:44:28,278.278 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.3.intermediate: training=True 2022-03-16 04:44:28,278.278 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.3.intermediate.dense: training=True 2022-03-16 04:44:28,278.278 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.3.output: training=True 2022-03-16 04:44:28,278.278 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.3.output.dense: training=True 2022-03-16 04:44:28,278.278 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.3.output.LayerNorm: training=True 2022-03-16 04:44:28,279.279 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.3.output.dropout: training=True 2022-03-16 04:44:28,279.279 2829:uni_pipeline.py:744 do_train(): module.module.bert.img_embedding: training=True 2022-03-16 04:44:28,279.279 2829:uni_pipeline.py:744 do_train(): module.module.bert.dropout: training=True 2022-03-16 04:44:28,279.279 2829:uni_pipeline.py:744 do_train(): module.module.cls: training=True 
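The "training=True" records above and below come from walking every submodule of the DistributedDataParallel-wrapped model and printing its train/eval flag. A minimal sketch of that audit, assuming plain Python logging; the helper name log_training_flags is hypothetical, not uni_pipeline.py's actual API:

    import logging

    from torch import nn

    logging.basicConfig(level=logging.INFO, format='%(message)s')

    def log_training_flags(model: nn.Module) -> None:
        # Hypothetical helper: mirrors the "<name>: training=<flag>" records
        # in this log by enumerating every named submodule.
        for name, module in model.named_modules():
            logging.info('%s: training=%s', name or 'module', module.training)

    if __name__ == '__main__':
        model = nn.Sequential(nn.Linear(768, 3072), nn.GELU(), nn.Linear(3072, 768))
        model.train()  # .train() flips every submodule, so all lines read training=True
        log_training_flags(model)
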
2022-03-16 04:44:28,279.279 2829:uni_pipeline.py:744 do_train(): module.module.cls.predictions: training=True 2022-03-16 04:44:28,279.279 2829:uni_pipeline.py:744 do_train(): module.module.cls.predictions.transform: training=True 2022-03-16 04:44:28,279.279 2829:uni_pipeline.py:744 do_train(): module.module.cls.predictions.transform.dense: training=True 2022-03-16 04:44:28,279.279 2829:uni_pipeline.py:744 do_train(): module.module.cls.predictions.transform.LayerNorm: training=True 2022-03-16 04:44:28,279.279 2829:uni_pipeline.py:744 do_train(): module.module.cls.predictions.decoder: training=True 2022-03-16 04:44:28,279.279 2829:uni_pipeline.py:744 do_train(): module.module.loss: training=True 2022-03-16 04:44:28,279.279 2829:uni_pipeline.py:744 do_train(): module.module.loss.log_soft: training=True 2022-03-16 04:44:28,279.279 2829:uni_pipeline.py:744 do_train(): module.module.loss.kl: training=True 2022-03-16 04:44:28,279.279 2829:uni_pipeline.py:744 do_train(): module.module.tag_loss: training=True 2022-03-16 04:44:28,280.280 2829:uni_pipeline.py:744 do_train(): module.image_encoder: training=True 2022-03-16 04:44:28,280.280 2829:uni_pipeline.py:744 do_train(): module.image_encoder.module: training=True 2022-03-16 04:44:28,280.280 2829:uni_pipeline.py:744 do_train(): module.image_encoder.module.patch_embed: training=True 2022-03-16 04:44:28,280.280 2829:uni_pipeline.py:744 do_train(): module.image_encoder.module.patch_embed.proj: training=True 2022-03-16 04:44:28,280.280 2829:uni_pipeline.py:744 do_train(): module.image_encoder.module.pos_drop: training=True 2022-03-16 04:44:28,280.280 2829:uni_pipeline.py:744 do_train(): module.image_encoder.module.blocks: training=True 2022-03-16 04:44:28,280.280 2829:uni_pipeline.py:744 do_train(): module.image_encoder.module.norm: training=True 2022-03-16 04:44:28,280.280 2829:uni_pipeline.py:744 do_train(): module.image_encoder.module.pre_logits: training=True 2022-03-16 04:44:28,280.280 2829:uni_pipeline.py:744 do_train(): module.image_encoder.module.head: training=True 2022-03-16 04:44:28,280.280 2829:uni_pipeline.py:745 do_train(): dataset = DatasetPlusTransform(dataset=CaptionIdxTSVDataset(data=TaxCocoCaption, split=train, caption_version=None), transform=Compose( Compose( ImageTransform2Dict(image_transform=Compose( ToPILImage() RandomResizedCrop(size=(384, 384), scale=(0.08, 1.0), ratio=(0.75, 1.3333), interpolation=PIL.Image.BILINEAR) ColorJitter(brightness=[0.6, 1.4], contrast=[0.6, 1.4], saturation=[0.6, 1.4], hue=None) RandomHorizontalFlip(p=0.5) ToTensor() Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]) )) ) LoadCaption(tsv=TSVSplitProperty(tsv=CompositeTSVFile(list_file=data/TaxCocoCaption/train.shuffle.txt, seq_file=data/TaxCocoCaption/trainX.caption.tsv))) LoadLabel(data=TaxCocoCaption, split=train, version=vinvl) TransCaptionTensorizer(tensorizer=, pad_to_max=True, pad_image_to_max=True) )) 2022-03-16 04:44:28,281.281 2829:trainer.py:367 do_train_dict(): Start training 2022-03-16 04:44:28,283.283 2829:qd_common.py:3452 print_frame_info(): func name = do_train_dict; model = DistributedDataParallel( (module): ImageCaptioning( (module): TaggerEncDecSplitForImageCaptioning( (bert): TaggerEncDecCLSEmbSplitBertImgModel( (embeddings): BertEmbeddings( (word_embeddings): Embedding(30522, 768, padding_idx=0) (position_embeddings): Embedding(512, 768) (token_type_embeddings): Embedding(2, 768) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0, inplace=False) ) (extra_embeddings): BertEmbeddings( 
(word_embeddings): Embedding(30522, 768, padding_idx=0) (position_embeddings): Embedding(512, 768) (token_type_embeddings): Embedding(2, 768) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0, inplace=False) ) (encoder): TIMMVitSplitEncoder( (blocks): ModuleList( (0): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (1): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (2): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (3): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (4): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (5): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): 
LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (6): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (7): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (8): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (9): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (10): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (11): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( 
(fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) ) (tag_blocks): ModuleList( (0): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (1): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (2): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (3): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) ) ) (caption_pooler): BertPooler( (dense): Linear(in_features=768, out_features=768, bias=True) (activation): Tanh() ) (pooler): BertPooler( (dense): Linear(in_features=768, out_features=768, bias=True) (activation): Tanh() ) (tag_logit): BertCaptioningHeads( (predictions): BertLMPredictionHead( (transform): BertPredictionHeadTransform( (dense): Linear(in_features=768, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) ) (decoder): Linear(in_features=768, out_features=30522, bias=False) ) ) (decoder): BertEncoder( (layer): ModuleList( (0): BertLayer( (attention): BertAttention( (self): BertSelfAttention( (query): Linear(in_features=768, out_features=768, bias=True) (key): Linear(in_features=768, out_features=768, bias=True) (value): Linear(in_features=768, out_features=768, bias=True) (dropout): Dropout(p=0.1, inplace=False) ) (output): BertSelfOutput( (dense): Linear(in_features=768, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, 
elementwise_affine=True) (dropout): Dropout(p=0, inplace=False) ) ) (intermediate): BertIntermediate( (dense): Linear(in_features=768, out_features=3072, bias=True) ) (output): BertOutput( (dense): Linear(in_features=3072, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0, inplace=False) ) ) (1): BertLayer( (attention): BertAttention( (self): BertSelfAttention( (query): Linear(in_features=768, out_features=768, bias=True) (key): Linear(in_features=768, out_features=768, bias=True) (value): Linear(in_features=768, out_features=768, bias=True) (dropout): Dropout(p=0.1, inplace=False) ) (output): BertSelfOutput( (dense): Linear(in_features=768, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0, inplace=False) ) ) (intermediate): BertIntermediate( (dense): Linear(in_features=768, out_features=3072, bias=True) ) (output): BertOutput( (dense): Linear(in_features=3072, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0, inplace=False) ) ) (2): BertLayer( (attention): BertAttention( (self): BertSelfAttention( (query): Linear(in_features=768, out_features=768, bias=True) (key): Linear(in_features=768, out_features=768, bias=True) (value): Linear(in_features=768, out_features=768, bias=True) (dropout): Dropout(p=0.1, inplace=False) ) (output): BertSelfOutput( (dense): Linear(in_features=768, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0, inplace=False) ) ) (intermediate): BertIntermediate( (dense): Linear(in_features=768, out_features=3072, bias=True) ) (output): BertOutput( (dense): Linear(in_features=3072, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0, inplace=False) ) ) (3): BertLayer( (attention): BertAttention( (self): BertSelfAttention( (query): Linear(in_features=768, out_features=768, bias=True) (key): Linear(in_features=768, out_features=768, bias=True) (value): Linear(in_features=768, out_features=768, bias=True) (dropout): Dropout(p=0.1, inplace=False) ) (output): BertSelfOutput( (dense): Linear(in_features=768, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0, inplace=False) ) ) (intermediate): BertIntermediate( (dense): Linear(in_features=768, out_features=3072, bias=True) ) (output): BertOutput( (dense): Linear(in_features=3072, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0, inplace=False) ) ) ) ) (img_embedding): Identity() (dropout): Identity() ) (cls): BertCaptioningHeads( (predictions): BertLMPredictionHead( (transform): BertPredictionHeadTransform( (dense): Linear(in_features=768, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) ) (decoder): Linear(in_features=768, out_features=30522, bias=False) ) ) (loss): BertCaptioningLoss( (log_soft): LogSoftmax(dim=1) (kl): KLDivLoss() ) (tag_loss): FocalLossWithLogitsNegLoss(alpha=0.5, gamma=1) ) (image_encoder): InputAsDict( (module): VisionTransformer( (patch_embed): PatchEmbed( (proj): Conv2d(3, 768, kernel_size=(16, 16), stride=(16, 16)) ) (pos_drop): Dropout(p=0.0, inplace=False) (blocks): ModuleList() (norm): Identity() (pre_logits): Identity() (head): Linear(in_features=768, out_features=1000, 
bias=True) ) ) ) ); data_loader = ; optimizer = AdamW ( Parameter Group 0 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 initial_lr: 0.0001 lr: 0.0001 param_names: ['module.cls_token', 'module.pos_embed', 'module.patch_embed.proj.weight', 'module.head.weight'] weight_decay: 0.05 Parameter Group 1 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 initial_lr: 0.0001 lr: 0.0001 param_names: ['module.patch_embed.proj.bias', 'module.head.bias'] weight_decay: 0.0 Parameter Group 2 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 initial_lr: 0.0001 lr: 0.0001 param_names: ['word_embeddings.weight', 'position_embeddings.weight', 'token_type_embeddings.weight'] weight_decay: 0.05 Parameter Group 3 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 initial_lr: 0.0001 lr: 0.0001 param_names: ['LayerNorm.weight', 'LayerNorm.bias'] weight_decay: 0 Parameter Group 4 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 initial_lr: 1e-05 lr: 1e-05 param_names: ['0.norm1.weight', '0.attn.qkv.weight', '0.attn.proj.weight', '0.norm2.weight', '0.mlp.fc1.weight', '0.mlp.fc2.weight', '1.norm1.weight', '1.attn.qkv.weight', '1.attn.proj.weight', '1.norm2.weight', '1.mlp.fc1.weight', '1.mlp.fc2.weight', '2.norm1.weight', '2.attn.qkv.weight', '2.attn.proj.weight', '2.norm2.weight', '2.mlp.fc1.weight', '2.mlp.fc2.weight', '3.norm1.weight', '3.attn.qkv.weight', '3.attn.proj.weight', '3.norm2.weight', '3.mlp.fc1.weight', '3.mlp.fc2.weight', '4.norm1.weight', '4.attn.qkv.weight', '4.attn.proj.weight', '4.norm2.weight', '4.mlp.fc1.weight', '4.mlp.fc2.weight', '5.norm1.weight', '5.attn.qkv.weight', '5.attn.proj.weight', '5.norm2.weight', '5.mlp.fc1.weight', '5.mlp.fc2.weight', '6.norm1.weight', '6.attn.qkv.weight', '6.attn.proj.weight', '6.norm2.weight', '6.mlp.fc1.weight', '6.mlp.fc2.weight', '7.norm1.weight', '7.attn.qkv.weight', '7.attn.proj.weight', '7.norm2.weight', '7.mlp.fc1.weight', '7.mlp.fc2.weight'] weight_decay: 0.05 Parameter Group 5 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 initial_lr: 1e-05 lr: 1e-05 param_names: ['0.norm1.bias', '0.attn.qkv.bias', '0.attn.proj.bias', '0.norm2.bias', '0.mlp.fc1.bias', '0.mlp.fc2.bias', '1.norm1.bias', '1.attn.qkv.bias', '1.attn.proj.bias', '1.norm2.bias', '1.mlp.fc1.bias', '1.mlp.fc2.bias', '2.norm1.bias', '2.attn.qkv.bias', '2.attn.proj.bias', '2.norm2.bias', '2.mlp.fc1.bias', '2.mlp.fc2.bias', '3.norm1.bias', '3.attn.qkv.bias', '3.attn.proj.bias', '3.norm2.bias', '3.mlp.fc1.bias', '3.mlp.fc2.bias', '4.norm1.bias', '4.attn.qkv.bias', '4.attn.proj.bias', '4.norm2.bias', '4.mlp.fc1.bias', '4.mlp.fc2.bias', '5.norm1.bias', '5.attn.qkv.bias', '5.attn.proj.bias', '5.norm2.bias', '5.mlp.fc1.bias', '5.mlp.fc2.bias', '6.norm1.bias', '6.attn.qkv.bias', '6.attn.proj.bias', '6.norm2.bias', '6.mlp.fc1.bias', '6.mlp.fc2.bias', '7.norm1.bias', '7.attn.qkv.bias', '7.attn.proj.bias', '7.norm2.bias', '7.mlp.fc1.bias', '7.mlp.fc2.bias'] weight_decay: 0.0 Parameter Group 6 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 initial_lr: 0.0001 lr: 0.0001 param_names: ['0.norm1.weight', '0.attn.qkv.weight', '0.attn.proj.weight', '0.norm2.weight', '0.mlp.fc1.weight', '0.mlp.fc2.weight', '1.norm1.weight', '1.attn.qkv.weight', '1.attn.proj.weight', '1.norm2.weight', '1.mlp.fc1.weight', '1.mlp.fc2.weight', '2.norm1.weight', '2.attn.qkv.weight', '2.attn.proj.weight', '2.norm2.weight', '2.mlp.fc1.weight', '2.mlp.fc2.weight', '3.norm1.weight', '3.attn.qkv.weight', '3.attn.proj.weight', '3.norm2.weight', '3.mlp.fc1.weight', '3.mlp.fc2.weight'] weight_decay: 0.05 Parameter Group 7 betas: (0.9, 
0.999) correct_bias: True eps: 1e-08 initial_lr: 0.0001 lr: 0.0001 param_names: ['0.norm1.bias', '0.attn.qkv.bias', '0.attn.proj.bias', '0.norm2.bias', '0.mlp.fc1.bias', '0.mlp.fc2.bias', '1.norm1.bias', '1.attn.qkv.bias', '1.attn.proj.bias', '1.norm2.bias', '1.mlp.fc1.bias', '1.mlp.fc2.bias', '2.norm1.bias', '2.attn.qkv.bias', '2.attn.proj.bias', '2.norm2.bias', '2.mlp.fc1.bias', '2.mlp.fc2.bias', '3.norm1.bias', '3.attn.qkv.bias', '3.attn.proj.bias', '3.norm2.bias', '3.mlp.fc1.bias', '3.mlp.fc2.bias'] weight_decay: 0.0 Parameter Group 8 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 initial_lr: 1e-05 lr: 1e-05 param_names: ['0.norm1.weight', '0.attn.qkv.weight', '0.attn.proj.weight', '0.norm2.weight', '0.mlp.fc1.weight', '0.mlp.fc2.weight', '1.norm1.weight', '1.attn.qkv.weight', '1.attn.proj.weight', '1.norm2.weight', '1.mlp.fc1.weight', '1.mlp.fc2.weight', '2.norm1.weight', '2.attn.qkv.weight', '2.attn.proj.weight', '2.norm2.weight', '2.mlp.fc1.weight', '2.mlp.fc2.weight', '3.norm1.weight', '3.attn.qkv.weight', '3.attn.proj.weight', '3.norm2.weight', '3.mlp.fc1.weight', '3.mlp.fc2.weight'] weight_decay: 0.05 Parameter Group 9 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 initial_lr: 1e-05 lr: 1e-05 param_names: ['0.norm1.bias', '0.attn.qkv.bias', '0.attn.proj.bias', '0.norm2.bias', '0.mlp.fc1.bias', '0.mlp.fc2.bias', '1.norm1.bias', '1.attn.qkv.bias', '1.attn.proj.bias', '1.norm2.bias', '1.mlp.fc1.bias', '1.mlp.fc2.bias', '2.norm1.bias', '2.attn.qkv.bias', '2.attn.proj.bias', '2.norm2.bias', '2.mlp.fc1.bias', '2.mlp.fc2.bias', '3.norm1.bias', '3.attn.qkv.bias', '3.attn.proj.bias', '3.norm2.bias', '3.mlp.fc1.bias', '3.mlp.fc2.bias'] weight_decay: 0.0 Parameter Group 10 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 initial_lr: 0.0001 lr: 0.0001 param_names: ['dense.weight'] weight_decay: 0.05 Parameter Group 11 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 initial_lr: 0.0001 lr: 0.0001 param_names: ['dense.bias'] weight_decay: 0.0 Parameter Group 12 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 initial_lr: 1e-05 lr: 1e-05 param_names: ['dense.weight'] weight_decay: 0.05 Parameter Group 13 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 initial_lr: 1e-05 lr: 1e-05 param_names: ['dense.bias'] weight_decay: 0.0 Parameter Group 14 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 initial_lr: 1e-05 lr: 1e-05 param_names: ['predictions.bias', 'predictions.transform.dense.bias', 'predictions.transform.LayerNorm.weight', 'predictions.transform.LayerNorm.bias'] weight_decay: 0.0 Parameter Group 15 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 initial_lr: 1e-05 lr: 1e-05 param_names: ['predictions.transform.dense.weight', 'predictions.decoder.weight'] weight_decay: 0.05 Parameter Group 16 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 initial_lr: 0.0001 lr: 0.0001 param_names: ['layer.0.attention.self.query.weight', 'layer.0.attention.self.key.weight', 'layer.0.attention.self.value.weight', 'layer.0.attention.output.dense.weight', 'layer.0.intermediate.dense.weight', 'layer.0.output.dense.weight', 'layer.1.attention.self.query.weight', 'layer.1.attention.self.key.weight', 'layer.1.attention.self.value.weight', 'layer.1.attention.output.dense.weight', 'layer.1.intermediate.dense.weight', 'layer.1.output.dense.weight', 'layer.2.attention.self.query.weight', 'layer.2.attention.self.key.weight', 'layer.2.attention.self.value.weight', 'layer.2.attention.output.dense.weight', 'layer.2.intermediate.dense.weight', 'layer.2.output.dense.weight', 
'layer.3.attention.self.query.weight', 'layer.3.attention.self.key.weight', 'layer.3.attention.self.value.weight', 'layer.3.attention.output.dense.weight', 'layer.3.intermediate.dense.weight', 'layer.3.output.dense.weight'] weight_decay: 0.05 Parameter Group 17 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 initial_lr: 0.0001 lr: 0.0001 param_names: ['layer.0.attention.self.query.bias', 'layer.0.attention.self.key.bias', 'layer.0.attention.self.value.bias', 'layer.0.attention.output.dense.bias', 'layer.0.attention.output.LayerNorm.weight', 'layer.0.attention.output.LayerNorm.bias', 'layer.0.intermediate.dense.bias', 'layer.0.output.dense.bias', 'layer.0.output.LayerNorm.weight', 'layer.0.output.LayerNorm.bias', 'layer.1.attention.self.query.bias', 'layer.1.attention.self.key.bias', 'layer.1.attention.self.value.bias', 'layer.1.attention.output.dense.bias', 'layer.1.attention.output.LayerNorm.weight', 'layer.1.attention.output.LayerNorm.bias', 'layer.1.intermediate.dense.bias', 'layer.1.output.dense.bias', 'layer.1.output.LayerNorm.weight', 'layer.1.output.LayerNorm.bias', 'layer.2.attention.self.query.bias', 'layer.2.attention.self.key.bias', 'layer.2.attention.self.value.bias', 'layer.2.attention.output.dense.bias', 'layer.2.attention.output.LayerNorm.weight', 'layer.2.attention.output.LayerNorm.bias', 'layer.2.intermediate.dense.bias', 'layer.2.output.dense.bias', 'layer.2.output.LayerNorm.weight', 'layer.2.output.LayerNorm.bias', 'layer.3.attention.self.query.bias', 'layer.3.attention.self.key.bias', 'layer.3.attention.self.value.bias', 'layer.3.attention.output.dense.bias', 'layer.3.attention.output.LayerNorm.weight', 'layer.3.attention.output.LayerNorm.bias', 'layer.3.intermediate.dense.bias', 'layer.3.output.dense.bias', 'layer.3.output.LayerNorm.weight', 'layer.3.output.LayerNorm.bias'] weight_decay: 0.0 ); scheduler = ; checkpointer = ; device = cuda; checkpoint_period = 5000; arguments = {'iteration': 0}; log_step = 100; data_partition = 1; explicit_average_grad = False; no_update = False; ema = None; use_amp = False; gradient_clip = 1.0; model_sub_name_fn = /opt/conda/lib/python3.7/site-packages/torch/optim/lr_scheduler.py:216: UserWarning: Please also save or load the state of the optimizer when saving or loading the scheduler. 
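The eighteen parameter groups printed above (0 through 17) encode a conventional split: weight matrices carry weight_decay 0.05 while their paired biases and LayerNorm parameters carry 0.0, and the groups holding already-pretrained transformer blocks run at lr 1e-5 while the rest run at the base lr 1e-4. A minimal sketch of assembling such groups follows; the name predicates are illustrative assumptions (which submodules get the lower rate is project-specific — note, for instance, that the ViT normX.weight entries above still receive decay), and the printed optimizer is a transformers-style AdamW with correct_bias, whereas the sketch uses torch.optim.AdamW.

```python
import torch

def build_adamw(model, base_lr=1e-4, low_lr=1e-5, wd=0.05):
    # Assumed predicates -- only the bias/LayerNorm rule is clearly
    # visible in the groups printed above; adjust to the actual model.
    def no_decay(name):
        return name.endswith('.bias') or 'LayerNorm' in name
    def low_rate(name):
        return 'blocks.' in name  # e.g. pretrained encoder blocks (assumption)

    groups = {}
    for name, p in model.named_parameters():
        if not p.requires_grad:
            continue
        lr = low_lr if low_rate(name) else base_lr
        decay = 0.0 if no_decay(name) else wd
        g = groups.setdefault((lr, decay), {
            'params': [], 'param_names': [], 'lr': lr, 'weight_decay': decay})
        g['params'].append(p)
        g['param_names'].append(name)  # extra keys are kept in param_groups
    return torch.optim.AdamW(list(groups.values()), betas=(0.9, 0.999), eps=1e-8)
```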
warnings.warn(SAVE_STATE_WARNING, UserWarning) 2022-03-16 04:44:28,287.287 2829:checkpoint.py:222 save(): Saving checkpoint to output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0000000.pt 2022-03-16 04:45:52,906.906 3427:tsv_io.py:129 __getitem__(): too long to load fname = TSVFile(tsv_file='data/coco_caption/train.img.tsv'), source=0, row=44475 2022-03-16 04:45:52,906.906 3424:tsv_io.py:129 __getitem__(): too long to load fname = TSVFile(tsv_file='data/coco_caption/train.img.tsv'), source=0, row=7048 2022-03-16 04:45:52,906.906 3425:tsv_io.py:129 __getitem__(): too long to load fname = TSVFile(tsv_file='data/coco_caption/train.img.tsv'), source=0, row=12705 2022-03-16 04:45:52,907.907 3430:tsv_io.py:129 __getitem__(): too long to load fname = TSVFile(tsv_file='data/coco_caption/train.img.tsv'), source=0, row=91679 2022-03-16 04:45:52,907.907 3426:tsv_io.py:129 __getitem__(): too long to load fname = TSVFile(tsv_file='data/coco_caption/train.img.tsv'), source=0, row=25979 2022-03-16 04:45:52,907.907 3429:tsv_io.py:129 __getitem__(): too long to load fname = TSVFile(tsv_file='data/coco_caption/train.img.tsv'), source=0, row=94718 2022-03-16 04:45:52,908.908 3428:tsv_io.py:129 __getitem__(): too long to load fname = TSVFile(tsv_file='data/coco_caption/train.img.tsv'), source=0, row=25521 2022-03-16 04:45:52,908.908 3431:tsv_io.py:129 __getitem__(): too long to load fname = TSVFile(tsv_file='data/coco_caption/train.img.tsv'), source=0, row=47118 /opt/conda/lib/python3.7/site-packages/torch/nn/functional.py:1639: UserWarning: nn.functional.sigmoid is deprecated. Use torch.sigmoid instead. warnings.warn("nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.") 2022-03-16 04:46:01,792.792 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.0 2022-03-16 04:46:01,792.792 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 212.95741271972656 2022-03-16 04:46:01,792.792 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
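Two of the records above belong together: the lr_scheduler.py warning asks for the optimizer state to be saved alongside the scheduler, and checkpoint.py then writes snapshot/model_iter_0000000.pt. The repo's checkpointer is not shown in the log; a minimal sketch that satisfies the warning by bundling model, optimizer, scheduler, and the iteration counter (matching arguments = {'iteration': 0}) into one file:

```python
import torch

def save_checkpoint(path, model, optimizer, scheduler, iteration):
    # Saving optimizer *and* scheduler state addresses the SAVE_STATE_WARNING
    # above and lets a resumed run continue the LR schedule exactly.
    torch.save({'model': model.state_dict(),
                'optimizer': optimizer.state_dict(),
                'scheduler': scheduler.state_dict(),
                'iteration': iteration}, path)

def load_checkpoint(path, model, optimizer, scheduler):
    ckpt = torch.load(path, map_location='cpu')
    model.load_state_dict(ckpt['model'])
    optimizer.load_state_dict(ckpt['optimizer'])
    scheduler.load_state_dict(ckpt['scheduler'])
    return ckpt.get('iteration', 0)
```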
= 70.28994750976562 2022-03-16 04:46:04,247.247 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.004969404079020023 2022-03-16 04:46:04,247.247 2829:tagger_caption_uni_pipeline_expanding.py:416 forward(): # of tokens = 577 2022-03-16 04:46:04,248.248 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 04:46:04,249.249 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'ferry', 'boat', 'in', 'a', 'large', 'canal', 'and', 'large', 'buildings', '[MASK]', 'sobs', 'the', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 04:46:04,267.267 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['building', 'boat', 'sky', 'water', 'city', 'bridge', 'window', 'tower', 'river', 'skyscraper', 'flag', 'wake', 'person', 'cloud', 'wave', 'wall', 'crane', 'tree', 'ship', '[UNK]', 'train', 'pillar', 'dock', 'sign', 'arch', 'antenna', 'roof', 'cabin', 'top', 'harbor', 'structure', 'bus', 'bottom', 'pole', 'dome', 'ripple', 'reflection', 'spire', 'rope', 'car', 'shore', 'walkway', 'column', 'man', 'railing', 'name', 'ramp', 'door', 'base', 'mast'] 2022-03-16 04:46:20,520.520 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['city', 'water', 'building', 'large', 'river', 'person', 'bridge', 'window', 'sky', 'bus', 'boat', 'canal', 'shore', 'ferry', 'crane', 'skyscraper'] /tmp/code/src/qd/mask/solver/optimization.py:186: UserWarning: This overload of add_ is deprecated: add_(Number alpha, Tensor other) Consider using one of the following signatures instead: add_(Tensor other, *, Number alpha) (Triggered internally at /pytorch/torch/csrc/utils/python_arg_parser.cpp:882.) exp_avg.mul_(beta1).add_(1.0 - beta1, grad) 2022-03-16 04:48:55,989.989 2829:trainer.py:487 do_train_dict(): eta: 1 day, 2:22:05 iter: 100 speed: 191.3 images/sec total_norm: 181.1051 (196.3794) loss: 212.3282 (214.7145) masked_loss: 4.9153 (5.3315) tag_loss: 207.9742 (209.3830) time: 1.4309 (1.4314) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4260 (1.4265) lr: 0.000100 max mem: 26307 2022-03-16 04:48:56,351.351 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.1428571492433548 2022-03-16 04:48:56,351.351 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 198.2447967529297 2022-03-16 04:48:56,351.351 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
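The UserWarning above pinpoints src/qd/mask/solver/optimization.py:186, where the Adam update still uses the deprecated positional add_ overload. The keyword form the warning recommends is numerically identical:

```python
import torch

beta1, grad = 0.9, torch.randn(8)
exp_avg = torch.zeros(8)

# Deprecated overload flagged at optimization.py:186:
#     exp_avg.mul_(beta1).add_(1.0 - beta1, grad)
# Supported keyword form:
exp_avg.mul_(beta1).add_(grad, alpha=1.0 - beta1)
```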
= 69.85773849487305 2022-03-16 04:48:58,816.816 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0074226646684110165 2022-03-16 04:48:58,816.816 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 04:48:58,816.816 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'todd', '##ler', 'wearing', 'a', 'large', 'floppy', 'hat', 'playing', 'in', 'the', 'sand', '[MASK]', '[MASK]', 'beach', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 04:48:58,832.832 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sand', 'sky', 'foot', 'shirt', 'hand', 'handle', 'girl', 'child', 'blanket', 'hat', 'bucket', 'beach', 'leg', '[UNK]', 'person', 'baby', 'hair', 'toy', 'head', 'short', 'woman', 'boy', 'top', 'spoon', 'towel', 'flower', 'face', 'ground', 'cup', 'water', 'ocean', 'shovel', 'brush', 'dress', 'cloth', 'lid', 'bag', 'hole', 'arm', 'man', 'bed', 'container', 'food', 'chair', 'ball', 'fork', 'family', 'strap', 'toe', 'knee'] 2022-03-16 04:49:14,775.775 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'hand', 'large', 'woman', 'short', 'hair', 'girl', 'child', 'foot', 'baby', 'beach', 'sky', 'shirt', 'leg', 'object', 'handle', 'sand', 'hat', 'flower', 'blanket', 'toy', 'fork', 'towel', 'bucket', 'spoon', 'bikini', 'floppy'] 2022-03-16 04:51:38,600.600 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:10:26 iter: 200 speed: 314.9 images/sec total_norm: 142.2078 (147.6038) loss: 190.7610 (191.1786) masked_loss: 4.1353 (4.1499) tag_loss: 186.9331 (187.0288) time: 1.4345 (1.6261) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4293 (1.6210) lr: 0.000100 max mem: 26307 2022-03-16 04:51:38,960.960 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.37142857909202576 2022-03-16 04:51:38,960.960 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 187.27215576171875 2022-03-16 04:51:38,960.960 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
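Each trainer.py do_train_dict() record reports metrics as value (running value), e.g. loss: 190.7610 (191.1786) at iteration 200. The exact semantics are not logged; in maskrcnn-benchmark-style trainers, which this format resembles, the first number is the median over a recent window and the parenthesized one the global average — a sketch under that assumption:

```python
from collections import deque

class SmoothedValue:
    """Track a series; print 'windowed median (global average)'."""
    def __init__(self, window=20):
        self.window = deque(maxlen=window)
        self.total = 0.0
        self.count = 0

    def update(self, value):
        self.window.append(value)
        self.total += value
        self.count += 1

    def __str__(self):
        ordered = sorted(self.window)
        median = ordered[len(ordered) // 2]
        return f'{median:.4f} ({self.total / max(self.count, 1):.4f})'
```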
= 70.15105438232422 2022-03-16 04:51:41,492.492 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.008562282659113407 2022-03-16 04:51:41,492.492 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 04:51:41,493.493 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'an', '[MASK]', 'steam', 'locomotive', 'approaching', 'and', 'about', 'to', 'do', 'through', 'an', 'under', '##pass', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 04:51:41,508.508 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['train', 'track', 'engine', 'bush', 'number', 'sky', 'tree', 'gravel', 'car', 'hill', 'ground', 'smoke', 'front', 'mountain', 'steam', 'railroad', '[UNK]', 'wheel', 'building', 'roof', 'window', 'grass', 'pole', 'plant', 'chimney', 'bumper', 'person', 'house', 'photo', 'hillside', 'sign', 'light', 'platform', 'wall', 'man', 'conductor', 'bridge', 'locomotive', 'sidewalk', 'background', 'fence', 'rock', 'post', 'flower', 'snow', 'bell', 'box', 'door', 'station', 'cloud'] 2022-03-16 04:51:57,499.499 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'house', 'number', 'old', 'building', 'car', 'ground', 'rock', 'track', 'wall', 'hill', 'mountain', 'engine', 'train', 'tree', 'sign', 'sky', 'shadow', 'wheel', 'steam', 'smoke', 'bush', 'blind', 'locomotive', 'approaching', 'gravel', 'hillside'] 2022-03-16 04:54:21,136.136 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:42:56 iter: 300 speed: 315.0 images/sec total_norm: 137.6106 (141.0283) loss: 183.0483 (182.1217) masked_loss: 3.8971 (3.9055) tag_loss: 179.2304 (178.2163) time: 1.4334 (1.6253) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4282 (1.6202) lr: 0.000100 max mem: 26307 2022-03-16 04:54:21,499.499 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.3142857253551483 2022-03-16 04:54:21,500.500 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 183.96987915039062 2022-03-16 04:54:21,500.500 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.22936820983887 2022-03-16 04:54:24,058.058 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.009669368155300617 2022-03-16 04:54:24,059.059 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 04:54:24,059.059 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'little', 'boy', 'playing', 'baseball', '[MASK]', '[MASK]', 'swing', 'his', 'bat', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 04:54:24,074.074 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'helmet', 'fence', 'line', 'man', '[UNK]', 'person', 'grass', 'shoe', 'game', 'catcher', 'field', 'bat', 'dirt', 'baseball', 'short', 'boy', 'umpire', 'woman', 'hand', 'ground', 'glove', 'pole', 'mask', 'hat', 'batter', 'plate', 'sign', 'ball', 'player', 'uniform', 'tree', 'leg', 'cap', 'shadow', 'sky', 'jean', 'jersey', 'head', 'girl', 'belt', 'camera', 'hair', 'jacket', 'pad', 'bag', 'sunglasses', 'sock', 'bench', 'child'] 2022-03-16 04:54:40,041.041 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'game', 'little', 'line', 'player', 'woman', 'short', 'field', 'person', 'boy', 'baseball', 'sign', 'shirt', 'grass', 'hat', 'cap', 'uniform', 'pole', 'dirt', 'bat', 'mask', 'fence', 'helmet', 'shoe', 'catcher', 'glove', 'umpire'] 2022-03-16 04:57:03,554.554 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:57:15 iter: 400 speed: 315.2 images/sec total_norm: 135.6639 (141.4707) loss: 183.5033 (182.2539) masked_loss: 3.6852 (3.6802) tag_loss: 179.7325 (178.5737) time: 1.4345 (1.6242) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4295 (1.6191) lr: 0.000099 max mem: 26307 2022-03-16 04:57:03,915.915 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.15625 2022-03-16 04:57:03,915.915 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 169.03848266601562 2022-03-16 04:57:03,916.916 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.91149139404297 2022-03-16 04:57:06,492.492 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.010557741858065128 2022-03-16 04:57:06,492.492 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 04:57:06,492.492 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'some', 'old', '[MASK]', 'go', 'for', 'drinks', 'at', 'a', 'pub', '##bery', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 04:57:06,508.508 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hat', 'glass', 'woman', 'person', 'table', 'man', 'hair', 'wall', 'shirt', 'light', 'head', 'cap', 'jacket', 'ceiling', 'glasses', 'wine', 'archway', 'cup', '[UNK]', 'hand', 'face', 'bottle', 'bowl', 'group', 'arch', 'plate', 'window', 'sweater', 'building', 'picture', 'chair', 'food', 'pitcher', 'room', 'jean', 'candle', 'bar', 'coat', 'lamp', 'napkin', 'ear', 'purse', 'sign', 'suit', 'vase', 'lady', 'basket', 'paper', 'door', 'water'] 2022-03-16 04:57:22,568.568 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'number', 'old', 'room', 'building', 'light', 'woman', 'hair', 'person', 'table', 'wall', 'glass', 'paper', 'sign', 'jean', 'shirt', 'picture', 'wine', 'speaker', 'ceiling', 'hat', 'cap', 'jacket', 'pen', 'glasses', 'pitcher', 'pub', 'lid', 'vase', 'napkin', 'archway'] 2022-03-16 04:59:45,987.987 2829:trainer.py:487 do_train_dict(): eta: 1 day, 5:04:43 iter: 500 speed: 315.2 images/sec total_norm: 133.8038 (137.0907) loss: 180.4330 (180.3936) masked_loss: 3.6690 (3.7253) tag_loss: 176.5965 (176.6683) time: 1.4343 (1.6244) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4290 (1.6192) lr: 0.000099 max mem: 26307 2022-03-16 04:59:46,348.348 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.42105263471603394 2022-03-16 04:59:46,349.349 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 178.7808074951172 2022-03-16 04:59:46,349.349 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
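The Input ids samples above show BERT-style input corruption at work: most selected positions are replaced by [MASK], while a few are swapped for random vocabulary tokens — which is why stray words such as 'sobs' (iteration 0) and, in later samples, 'musique' and '相' appear inside otherwise ordinary captions. A sketch under the standard 80/10/10 recipe; the pipeline's actual selection rate and ratios are not logged:

```python
import random

SPECIAL = {'[CLS]', '[SEP]', '[PAD]'}

def corrupt(tokens, vocab, mask_prob=0.15):
    # 80% -> [MASK], 10% -> random token, 10% -> keep (standard BERT recipe)
    out = list(tokens)
    for i, tok in enumerate(tokens):
        if tok in SPECIAL or random.random() > mask_prob:
            continue
        r = random.random()
        if r < 0.8:
            out[i] = '[MASK]'
        elif r < 0.9:
            out[i] = random.choice(vocab)   # e.g. 'sobs', 'musique'
    return out
```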
= 68.73734283447266 2022-03-16 04:59:48,973.973 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.010883732698857784 2022-03-16 04:59:48,974.974 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 04:59:48,974.974 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'outside', 'seen', 'of', 'many', 'colorful', 'open', '[MASK]', '##s', 'with', 'a', 'huge', 'multi', 'colored', 'umbrella', '[MASK]', 'in', 'front', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 04:59:48,989.989 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['umbrella', 'person', 'woman', 'man', 'building', 'shirt', 'crowd', 'tree', 'hat', 'hair', '[UNK]', 'head', 'beach', 'sunglasses', 'pole', 'bag', 'sky', 'chair', 'stripe', 'short', 'background', 'top', 'purse', 'roof', 'tent', 'arm', 'market', 'towel', 'girl', 'hand', 'wall', 'sign', 'dress', 'water', 'house', 'jacket', 'flag', 'cap', 'grass', 'large', 'face', 'fence', 'window', 'child', 'table', 'next', 'lady', 'glasses', 'backpack', 'couple'] 2022-03-16 05:00:04,956.956 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['many', 'head', 'man', 'building', 'open', 'front', 'woman', 'hair', 'person', 'boy', 'beach', 'shirt', 'huge', 'crowd', 'multi', 'hat', 'tent', 'umbrella', 'colorful'] 2022-03-16 05:02:28,541.541 2829:trainer.py:487 do_train_dict(): eta: 1 day, 5:08:58 iter: 600 speed: 315.0 images/sec total_norm: 131.7876 (134.7105) loss: 176.1494 (175.0405) masked_loss: 3.4661 (3.4510) tag_loss: 172.4846 (171.5895) time: 1.4347 (1.6255) data: 0.0001 (0.0001) to_device: 0.0052 (0.0051) time_gpu: 1.4293 (1.6202) lr: 0.000099 max mem: 26307 2022-03-16 05:02:28,901.901 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.39393940567970276 2022-03-16 05:02:28,902.902 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 157.16656494140625 2022-03-16 05:02:28,902.902 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.56050981794085 2022-03-16 05:02:31,550.550 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.011422410607337952 2022-03-16 05:02:31,550.550 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 05:02:31,550.550 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'batter', 'musique', 'catcher', 'and', '[MASK]', 'in', 'a', 'baseball', 'game', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 05:02:31,565.565 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['man', 'shirt', '[UNK]', 'bat', 'helmet', 'shoe', 'field', 'person', 'player', 'baseball', 'roof', 'grass', 'game', 'sky', 'batter', 'glove', 'fence', 'dirt', 'catcher', 'uniform', 'building', 'hat', 'belt', 'leg', 'line', 'hand', 'pole', 'jacket', 'plate', 'tree', 'mask', 'stadium', 'ball', 'ground', 'umpire', 'home', 'bench', 'jersey', 'shadow', 'head', 'sock', 'camera', 'short', 'cap', 'woman', 'window', 'arm', 'sign', 'crowd', 'wall'] 2022-03-16 05:02:47,498.498 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'game', 'building', 'player', 'field', 'ground', 'person', 'stand', 'baseball', 'shirt', 'jersey', 'leg', 'roof', 'plate', 'grass', 'belt', 'hat', 'cap', 'uniform', 'jacket', 'dirt', 'bat', 'mask', 'helmet', 'shoe', 'catcher', 'glove', 'umpire', 'spectator', 'batter'] 2022-03-16 05:05:11,492.492 2829:trainer.py:487 do_train_dict(): eta: 1 day, 5:11:51 iter: 700 speed: 314.2 images/sec total_norm: 133.7510 (140.6465) loss: 176.2631 (177.3342) masked_loss: 3.2643 (3.2633) tag_loss: 172.8288 (174.0709) time: 1.4354 (1.6295) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4305 (1.6245) lr: 0.000099 max mem: 26307 2022-03-16 05:05:11,853.853 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.38235294818878174 2022-03-16 05:05:11,853.853 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 160.90536499023438 2022-03-16 05:05:11,853.853 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.29880237579346 2022-03-16 05:05:14,540.540 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01174083910882473 2022-03-16 05:05:14,541.541 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 05:05:14,541.541 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'group', 'of', 'children', '[MASK]', 'around', 'a', 'table', 'with', 'a', 'large', 'cake', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 05:05:14,556.556 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['table', 'shirt', 'hair', 'cake', 'plate', 'man', '[UNK]', 'child', 'glass', 'woman', 'boy', 'person', 'head', 'girl', 'hat', 'hand', 'bowl', 'wall', 'window', 'sweater', 'cup', 'fork', 'knife', 'food', 'flower', 'beard', 'tray', 'chair', 'napkin', 'group', 'face', 'container', 'candle', 'baby', 'wine', 'picture', 'necklace', 'lamp', 'spoon', 'cap', 'family', 'glasses', 'kid', 'jacket', 'bag', 'lid', 'dish', 'ear', 'dessert', 'jean'] 2022-03-16 05:05:30,610.610 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'group', 'hand', 'large', 'door', 'woman', 'cup', 'hair', 'girl', 'person', 'child', 'table', 'wall', 'food', 'boy', 'shirt', 'handle', 'plate', 'knife', 'hat', 'cap', 'flower', 'glasses', 'fork', 'cake', 'beard', 'lamp', 'spoon'] 2022-03-16 05:07:54,307.307 2829:trainer.py:487 do_train_dict(): eta: 1 day, 5:13:08 iter: 800 speed: 314.5 images/sec total_norm: 134.2146 (138.4319) loss: 170.5890 (174.6861) masked_loss: 3.0440 (3.1001) tag_loss: 168.2145 (171.5861) time: 1.4348 (1.6282) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4299 (1.6230) lr: 0.000099 max mem: 26307 2022-03-16 05:07:54,667.667 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.42424243688583374 2022-03-16 05:07:54,667.667 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 178.43075561523438 2022-03-16 05:07:54,668.668 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
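The Tag mAP figure creeps up from 0.0050 at iteration 0 to about 0.0117 here, consistent with a per-class average precision over the multi-hot tag targets, averaged across classes that have at least one positive. The evaluation code is not shown in the log; a plain-numpy sketch of that definition:

```python
import numpy as np

def average_precision(scores, labels):
    # scores: (N,) logits for one tag class; labels: (N,) in {0, 1}
    order = np.argsort(-scores)
    labels = labels[order]
    if labels.sum() == 0:
        return 0.0
    cum_pos = np.cumsum(labels)
    precision = cum_pos / (np.arange(len(labels)) + 1)
    return float((precision * labels).sum() / labels.sum())

def tag_map(scores, labels):
    # scores, labels: (num_samples, num_tags)
    aps = [average_precision(scores[:, c], labels[:, c])
           for c in range(scores.shape[1]) if labels[:, c].sum() > 0]
    return sum(aps) / max(len(aps), 1)
```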
= 68.56159040662978 2022-03-16 05:07:57,420.420 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.012304706498980522 2022-03-16 05:07:57,420.420 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 05:07:57,420.420 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'man', 'is', 'flying', 'a', 'kite', 'outside', '[MASK]', 'with', 'others', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 05:07:57,435.435 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'sky', 'cloud', 'grass', '[UNK]', 'park', 'field', 'person', 'head', 'man', 'building', 'leg', 'pole', 'ground', 'trunk', 'branch', 'bush', 'fence', 'tail', 'jacket', 'shadow', 'shirt', 'road', 'bench', 'post', 'roof', 'shoe', 'sign', 'hair', 'kite', 'house', 'next', 'wheel', 'hand', 'ear', 'car', 'face', 'horse', 'hat', 'street', 'jean', 'rock', 'front', 'light', 'hill', 'dirt', 'wall', 'window', 'woman', 'sidewalk'] 2022-03-16 05:08:13,444.444 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'park', 'hair', 'outside', 'person', 'child', 'arm', 'boy', 'tree', 'wood', 'sky', 'shirt', 'leg', 'shadow', 'grass', 'bush', 'hat', 'cloud', 'jacket', 'sweater', 'kite', 'stump'] 2022-03-16 05:10:37,139.139 2829:trainer.py:487 do_train_dict(): eta: 1 day, 5:13:32 iter: 900 speed: 314.4 images/sec total_norm: 137.5071 (138.8992) loss: 174.5231 (173.8569) masked_loss: 3.0852 (3.0507) tag_loss: 172.3848 (170.8062) time: 1.4357 (1.6283) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4305 (1.6232) lr: 0.000099 max mem: 26307 2022-03-16 05:10:37,504.504 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.3030303120613098 2022-03-16 05:10:37,505.505 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 194.32888793945312 2022-03-16 05:10:37,505.505 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.19998245239258 2022-03-16 05:10:40,240.240 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.012654680758714676 2022-03-16 05:10:40,240.240 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 05:10:40,240.240 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'group', '[MASK]', 'surf', '##board', 'are', 'stacked', 'in', 'a', 'shed', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 05:10:40,256.256 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['building', 'window', 'roof', 'sky', 'tree', '[UNK]', 'sign', 'street', 'person', 'shirt', 'house', 'wall', 'man', 'sidewalk', 'woman', 'road', 'hair', 'door', 'ground', 'bag', 'store', 'wire', 'head', 'motorcycle', 'pole', 'umbrella', 'bike', 'shop', 'wheel', 'shoe', 'plant', 'light', 'tire', 'hand', 'hat', 'jacket', 'bicycle', 'short', 'skirt', 'trash', 'line', 'basket', 'dress', 'car', 'leg', 'cart', 'clothes', 'chair', 'jean', 'boy'] 2022-03-16 05:10:56,196.196 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'group', 'hand', 'face', 'building', 'woman', 'short', 'ground', 'board', 'hair', 'person', 'table', 'arm', 'boy', 'window', 'tree', 'sign', 'sky', 'shirt', 'roof', 'bag', 'flag', 'bottle', 'bin', 'brush', 'shed', 'sidewalk', 'jug', 'crate'] 03-16 05:12:19.019 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 05:12:19.019 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 05:12:20.144 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-16 05:13:19,843.843 2829:trainer.py:487 do_train_dict(): eta: 1 day, 5:13:10 iter: 1000 speed: 314.7 images/sec total_norm: 132.1233 (136.8789) loss: 173.8704 (174.4273) masked_loss: 2.8622 (2.8671) tag_loss: 170.4219 (171.5602) time: 1.4341 (1.6270) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4289 (1.6218) lr: 0.000098 max mem: 26307 2022-03-16 05:13:20,204.204 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4848484992980957 2022-03-16 05:13:20,205.205 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 164.07948303222656 2022-03-16 05:13:20,205.205 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
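Interleaved with training, aml_server.py shells out to nvidia-smi and logs one dict per GPU ({'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, ...), confirming all eight V100s are near-fully utilized. Its implementation is not shown; the same shape can be produced with nvidia-smi's CSV query mode:

```python
import subprocess

def gpu_stats():
    out = subprocess.check_output(
        ['nvidia-smi',
         '--query-gpu=memory.used,memory.total,utilization.gpu',
         '--format=csv,noheader,nounits'],
        text=True)
    return [{'mem_used': int(used), 'mem_total': int(total), 'gpu_util': int(util)}
            for used, total, util in
            (line.split(',') for line in out.strip().splitlines())]
```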
= 68.00378071178089 2022-03-16 05:13:22,982.982 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.012677717953920364 2022-03-16 05:13:22,983.983 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 05:13:22,983.983 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'and', 'orange', 'bus', '相', 'street', 'next', 'to', 'trees', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 05:13:22,998.998 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['window', 'sky', 'bus', 'building', 'road', 'street', 'light', 'tire', 'pole', '[UNK]', 'windshield', 'tree', 'sign', 'wheel', 'stripe', 'line', 'roof', 'man', 'door', 'mirror', 'plate', 'sidewalk', 'front', 'car', 'logo', 'person', 'curb', 'license', 'bumper', 'driver', 'wall', 'letter', 'vest', 'number', 'traffic', 'fence', 'chimney', 'truck', 'van', 'jacket', 'writing', 'grill', 'cone', 'white', 'hat', 'helmet', 'arrow', 'back', 'city', 'shoe'] 2022-03-16 05:13:39,004.004 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'number', 'next', 'white', 'road', 'front', 'street', 'light', 'window', 'tree', 'sign', 'sky', 'bus', 'roof', 'orange', 'wheel', 'mirror', 'pole', 'fence', 'sidewalk', 'tire', 'advertisement', 'stripe', 'vent', 'windshield', 'bumper'] 2022-03-16 05:16:02,734.734 2829:trainer.py:487 do_train_dict(): eta: 1 day, 5:12:34 iter: 1100 speed: 314.3 images/sec total_norm: 129.3981 (134.6015) loss: 172.7981 (174.8679) masked_loss: 2.8405 (2.8324) tag_loss: 170.2540 (172.0356) time: 1.4360 (1.6289) data: 0.0001 (0.0005) to_device: 0.0050 (0.0048) time_gpu: 1.4309 (1.6236) lr: 0.000098 max mem: 26307 2022-03-16 05:16:03,095.095 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.23529411852359772 2022-03-16 05:16:03,095.095 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 180.63967895507812 2022-03-16 05:16:03,095.095 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.03975868225098 2022-03-16 05:16:05,908.908 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.012720104306936264 2022-03-16 05:16:05,909.909 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 05:16:05,909.909 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'professional', 'pitcher', 'on', 'the', 'mound', 'getting', '[MASK]', 'to', '[MASK]', 'the', 'ball', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 05:16:05,924.924 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['grass', 'belt', 'field', 'jersey', 'baseball', 'uniform', '[UNK]', 'glove', 'shirt', 'shoe', 'number', 'man', 'dirt', 'leg', 'head', 'player', 'hand', 'hat', 'fence', 'ball', 'cap', 'arm', 'ground', 'name', 'wall', 'helmet', 'back', 'face', 'logo', 'line', 'home', 'pole', 'person', 'mound', 'plate', 'base', 'stripe', 'letter', 'sign', 'hair', 'sock', 'tree', 'bat', 'game', 'shadow', 'pitcher', 'sleeve', 'shin', 'pitch', 'ear'] 2022-03-16 05:16:21,907.907 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'back', 'head', 'man', 'name', 'hand', 'number', 'line', 'player', 'field', 'ground', 'professional', 'person', 'arm', 'ready', 'baseball', 'ball', 'letter', 'shirt', 'jersey', 'leg', 'bag', 'grass', 'belt', 'cap', 'uniform', 'dirt', 'pitcher', 'fence', 'shoe', 'mound', 'cooler', 'glove', 'sunglasses'] 2022-03-16 05:18:45,558.558 2829:trainer.py:487 do_train_dict(): eta: 1 day, 5:11:33 iter: 1200 speed: 314.5 images/sec total_norm: 132.7703 (136.9227) loss: 172.2552 (172.7141) masked_loss: 2.6881 (2.7103) tag_loss: 169.2189 (170.0038) time: 1.4346 (1.6283) data: 0.0001 (0.0002) to_device: 0.0049 (0.0048) time_gpu: 1.4297 (1.6233) lr: 0.000098 max mem: 26307 2022-03-16 05:18:45,924.924 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4000000059604645 2022-03-16 05:18:45,924.924 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 181.54107666015625 2022-03-16 05:18:45,924.924 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 67.81044123722957 2022-03-16 05:18:48,783.783 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01294653583317995 2022-03-16 05:18:48,783.783 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 05:18:48,784.784 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'this', 'is', 'an', 'image', 'of', '[MASK]', '[MASK]', 'in', 'a', 'motorcycle', 'side', 'car', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 05:18:48,799.799 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['dog', 'ear', 'head', 'motorcycle', 'car', '[UNK]', 'bike', 'tire', 'man', 'seat', 'window', 'sign', 'eye', 'rack', 'building', 'mirror', 'light', 'wheel', 'wall', 'shirt', 'door', 'person', 'collar', 'pole', 'windshield', 'hand', 'handle', 'nose', 'face', 'truck', 'hair', 'jacket', 'hood', 'mouth', 'bar', 'tag', 'shadow', 'fender', 'sky', 'tree', 'chain', 'paw', 'plate', 'vehicle', 'woman', 'roof', 'jean', 'ground', 'leg', 'reflection'] 2022-03-16 05:19:04,876.876 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'side', 'building', 'car', 'mouth', 'wall', 'seat', 'writing', 'eye', 'window', 'letter', 'sign', 'image', 'gas', 'dog', 'nose', 'ear', 'tank', 'handle', 'mirror', 'pole', 'hood', 'bike', 'logo', 'pipe', 'motorcycle', 'tire', 'pillar', 'exhaust', 'fender', 'windshield'] 2022-03-16 05:21:28,633.633 2829:trainer.py:487 do_train_dict(): eta: 1 day, 5:10:28 iter: 1300 speed: 314.0 images/sec total_norm: 133.8144 (138.2303) loss: 173.9276 (173.6177) masked_loss: 2.7259 (2.7298) tag_loss: 170.3922 (170.8880) time: 1.4365 (1.6306) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4315 (1.6256) lr: 0.000098 max mem: 26307 2022-03-16 05:21:28,993.993 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4166666567325592 2022-03-16 05:21:28,993.993 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 177.44674682617188 2022-03-16 05:21:28,994.994 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
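The caption acc values move in coarse fractions (0.15625 = 5/32, 0.4166… = 5/12, 0.142857 ≈ 1/7), which is what one expects if accuracy is measured only on the handful of masked positions in one logged sample. That interpretation is an assumption — the pipeline code is not shown — but under it the computation is just:

```python
import torch

def masked_token_accuracy(logits, target_ids, is_masked):
    # logits: (T, V); target_ids: (T,); is_masked: (T,) bool over positions
    pred = logits.argmax(dim=-1)
    n = int(is_masked.sum())
    if n == 0:
        return 0.0
    return float((pred[is_masked] == target_ids[is_masked]).sum()) / n
```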
= 67.68502698625836 2022-03-16 05:21:31,923.923 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.013217877596616745 2022-03-16 05:21:31,924.924 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 05:21:31,924.924 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'lady', 'bent', '[MASK]', 'with', '[MASK]', 'tennis', 'rack', '##et', 'while', 'another', 'girl', 'looks', 'down', 'court', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 05:21:31,940.940 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'tennis', 'court', 'woman', 'shoe', 'shirt', 'short', 'hand', 'leg', 'hair', 'line', 'head', 'player', 'wall', 'letter', 'arm', 'logo', 'ground', 'band', 'skirt', 'top', 'handle', 'girl', 'outfit', 'ball', 'tank', 'ponytail', 'sock', 'person', 'face', 'necklace', 'curtain', 'sign', 'dress', 'man', 'banner', 'string', 'mouth', 'hat', 'stand', 'female', 'cap', 'advertisement', 'ear', 'wrist', 'shadow', 'uniform', 'stripe', 'knee', 'spectator'] 2022-03-16 05:21:47,909.909 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'hand', 'line', 'band', 'top', 'player', 'woman', 'court', 'short', 'hair', 'girl', 'mouth', 'wall', 'arm', 'lady', 'letter', 'shirt', 'leg', 'ear', 'tank', 'handle', 'tennis', 'bent', 'skirt', 'shoe', 'outfit', 'ponytail'] 2022-03-16 05:24:11,593.593 2829:trainer.py:487 do_train_dict(): eta: 1 day, 5:09:05 iter: 1400 speed: 314.2 images/sec total_norm: 127.5011 (130.7926) loss: 171.3775 (172.3323) masked_loss: 2.6194 (2.6451) tag_loss: 168.6091 (169.6872) time: 1.4354 (1.6297) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4301 (1.6245) lr: 0.000098 max mem: 26307 2022-03-16 05:24:11,954.954 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.3333333432674408 2022-03-16 05:24:11,954.954 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 202.48736572265625 2022-03-16 05:24:11,954.954 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 67.4316192626953 2022-03-16 05:24:14,878.878 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.013179749250411987 2022-03-16 05:24:14,878.878 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 05:24:14,879.879 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'kitchen', 'area', 'is', 'clean', 'and', 'ready', 'to', 'use', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 05:24:14,894.894 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['chair', 'wall', 'window', 'table', 'television', 'floor', 'room', 'shelf', 'drawer', '[UNK]', 'cushion', 'book', 'cabinet', 'microwave', 'ceiling', 'door', 'handle', 'picture', 'kitchen', 'desk', 'light', 'leg', 'rug', 'building', 'knob', 'stove', 'stool', 'top', 'couch', 'oven', 'box', 'bed', 'monitor', 'coffee', 'can', 'seat', 'lamp', 'living', 'carpet', 'bottle', 'bowl', 'pot', 'pillow', 'basket', 'screen', 'dresser', 'blind', 'lid', 'fireplace', 'paper'] 2022-03-16 05:24:30,829.829 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'can', 'area', 'room', 'light', 'television', 'design', 'floor', 'bed', 'table', 'wall', 'ready', 'glass', 'chair', 'paper', 'window', 'kitchen', 'leg', 'clean', 'handle', 'cabinet', 'bottle', 'ceiling', 'blind', 'pot', 'towel', 'shelf', 'trash', 'lid', 'garbage', 'drawer', 'tile', 'stove', 'knob', 'oven', 'microwave', 'rug', 'cushion'] 2022-03-16 05:26:54,578.578 2829:trainer.py:487 do_train_dict(): eta: 1 day, 5:07:32 iter: 1500 speed: 314.1 images/sec total_norm: 128.0446 (131.7554) loss: 171.2310 (171.7803) masked_loss: 2.4614 (2.5328) tag_loss: 168.9170 (169.2475) time: 1.4349 (1.6299) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4299 (1.6248) lr: 0.000098 max mem: 26307 2022-03-16 05:26:54,941.941 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4848484992980957 2022-03-16 05:26:54,941.941 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 153.58511352539062 2022-03-16 05:26:54,941.941 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 67.49695444107056 2022-03-16 05:26:57,910.910 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.013360531069338322 2022-03-16 05:26:57,910.910 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 05:26:57,911.911 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'this', 'double', '[MASK]', 'bus', '[MASK]', 'along', 'the', 'streets', 'in', 'madrid', ',', 'spain', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 05:26:57,926.926 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['window', 'bus', 'building', 'street', 'road', 'tire', 'line', 'wheel', 'sign', 'balcony', 'car', '[UNK]', 'advertisement', 'person', 'sidewalk', 'door', 'windshield', 'light', 'front', 'license', 'plate', 'letter', 'decker', 'logo', 'city', 'woman', 'man', 'pole', 'sky', 'driver', 'double', 'curb', 'deck', 'mirror', 'number', 'top', 'railing', 'ad', 'passenger', 'word', 'hair', 'shirt', 'red', 'tree', 'traffic', 'van', 'fence', 'wall', 'picture', 'back'] 2022-03-16 05:27:13,901.901 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'back', 'line', 'building', 'road', 'street', 'light', 'car', 'person', 'double', 'window', 'sign', 'bus', 'plate', 'wheel', 'license', 'balcony', 'tire', 'railing', 'vent', 'decker'] 2022-03-16 05:29:37,590.590 2829:trainer.py:487 do_train_dict(): eta: 1 day, 5:05:51 iter: 1600 speed: 314.1 images/sec total_norm: 129.1474 (132.6272) loss: 175.4478 (172.9883) masked_loss: 2.5299 (2.5677) tag_loss: 172.7031 (170.4205) time: 1.4346 (1.6301) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4294 (1.6249) lr: 0.000098 max mem: 26307 2022-03-16 05:29:37,952.952 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5454545617103577 2022-03-16 05:29:37,952.952 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 171.15374755859375 2022-03-16 05:29:37,952.952 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 67.51970717486213 2022-03-16 05:29:40,973.973 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.013498584739863873 2022-03-16 05:29:40,973.973 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 05:29:40,974.974 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'gi', '##raf', '##fe', '##s', 'standing', '[MASK]', 'the', 'side', 'of', 'a', 'grassy', 'hill', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 05:29:40,989.989 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'neck', 'head', 'grass', 'leg', 'field', 'tree', 'tail', 'bush', 'branch', 'ear', 'mane', 'ground', 'trunk', 'face', 'spot', 'wild', 'horn', 'hair', 'couple', 'zebra', 'group', 'grassy', 'shadow', 'next', 'bird', 'body', 'other', 'background', 'animal', 'sky', 'area', 'standing', 'tall', 'baby', 'brush', 'large', 'stripe', 'plain', 'small', 'dirt', 'lush', 'grazing', 'open', 'mouth', 'deer', 'top', 'rock', 'brown', 'stick'] 2022-03-16 05:29:56,977.977 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'side', 'field', 'ground', 'hill', 'neck', 'tree', 'branch', 'spot', 'leg', 'grass', 'tail', 'bush', 'grassy', 'mane'] 2022-03-16 05:32:21,062.062 2829:trainer.py:487 do_train_dict(): eta: 1 day, 5:04:20 iter: 1700 speed: 313.2 images/sec total_norm: 130.4102 (133.6316) loss: 171.8849 (171.3629) masked_loss: 2.4974 (2.4935) tag_loss: 169.2585 (168.8694) time: 1.4351 (1.6347) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4304 (1.6296) lr: 0.000097 max mem: 26307 2022-03-16 05:32:21,424.424 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5428571701049805 2022-03-16 05:32:21,424.424 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 173.7703857421875 2022-03-16 05:32:21,424.424 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 67.63988071017795 2022-03-16 05:32:24,476.476 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.013531708158552647 2022-03-16 05:32:24,476.476 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 05:32:24,477.477 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'shirt', 'less', 'man', 'with', 'tennis', 'rack', '##et', 'walking', 'on', 'a', '[MASK]', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 05:32:24,492.492 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['man', '[UNK]', 'tennis', 'hand', 'short', 'hat', 'leg', 'wall', 'court', 'head', 'cap', 'sock', 'shoe', 'fence', 'shadow', 'logo', 'sign', 'arm', 'ground', 'banner', 'letter', 'line', 'band', 'ball', 'face', 'handle', 'wrist', 'hair', 'advertisement', 'player', 'ear', 'back', 'tattoo', 'stripe', 'background', 'top', 'shirt', 'person', 'net', 'foot', 'spectator', 'sunglasses', 'pole', 'mouth', 'chair', 'watch', 'bracelet', 'string', 'board', 'knee'] 2022-03-16 05:32:40,513.513 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'hand', 'court', 'short', 'wall', 'arm', 'sign', 'leg', 'ear', 'tennis', 'net', 'hat', 'cap', 'pole', 'wrist', 'logo', 'fence', 'banner', 'bracelet', 'sock'] 2022-03-16 05:35:04,179.179 2829:trainer.py:487 do_train_dict(): eta: 1 day, 5:02:28 iter: 1800 speed: 313.9 images/sec total_norm: 129.6942 (132.7102) loss: 167.9482 (166.3510) masked_loss: 2.4083 (2.4409) tag_loss: 165.8114 (163.9101) time: 1.4350 (1.6312) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4297 (1.6260) lr: 0.000097 max mem: 26307 2022-03-16 05:35:04,539.539 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4000000059604645 2022-03-16 05:35:04,539.539 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 153.75851440429688 2022-03-16 05:35:04,540.540 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 67.99534807707134 2022-03-16 05:35:07,619.619 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.013688676990568638 2022-03-16 05:35:07,619.619 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 05:35:07,619.619 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'young', 'man', 'sits', 'on', '[MASK]', 'bench', '[MASK]', 'the', 'snow', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 05:35:07,634.634 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'bench', 'snow', 'building', 'tree', 'ground', 'sidewalk', 'light', 'shadow', 'car', 'pole', 'man', 'road', 'house', 'street', 'roof', 'hat', 'jacket', 'window', '[UNK]', 'head', 'coat', 'leg', 'person', 'path', 'curb', 'hand', 'lamp', 'shoe', 'grass', 'hair', 'chimney', 'bush', 'arm', 'park', 'post', 'line', 'can', 'cap', 'trunk', 'back', 'sign', 'bag', 'branch', 'foot', 'woman', 'puddle', 'trash', 'town', 'truck'] 2022-03-16 05:35:23,597.597 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'house', 'building', 'road', 'street', 'young', 'light', 'car', 'ground', 'window', 'tree', 'sky', 'leg', 'roof', 'snow', 'shadow', 'grass', 'hat', 'jacket', 'bench', 'porch', 'sidewalk'] 2022-03-16 05:37:47,396.396 2829:trainer.py:487 do_train_dict(): eta: 1 day, 5:00:35 iter: 1900 speed: 313.7 images/sec total_norm: 127.9392 (128.4986) loss: 168.6987 (169.7907) masked_loss: 2.3185 (2.4161) tag_loss: 166.4492 (167.3746) time: 1.4363 (1.6322) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4312 (1.6271) lr: 0.000097 max mem: 26307 2022-03-16 05:37:47,759.759 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6363636255264282 2022-03-16 05:37:47,760.760 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 155.89642333984375 2022-03-16 05:37:47,760.760 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 67.93155250549316 2022-03-16 05:37:50,876.876 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.013802473433315754 2022-03-16 05:37:50,876.876 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 05:37:50,876.876 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'an', 'old', 'fashioned', 'train', 'engine', 'parked', 'with', '[MASK]', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 05:37:50,892.892 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['train', 'ground', 'building', 'roof', 'sky', 'window', 'track', 'man', '[UNK]', 'light', 'smoke', 'door', 'pole', 'person', 'shirt', 'tree', 'bumper', 'station', 'chimney', 'car', 'wall', 'sign', 'engine', 'vent', 'front', 'barn', 'hat', 'shoe', 'steam', 'jean', 'wheel', 'wood', 'fence', 'jacket', 'gravel', 'head', 'pipe', 'table', 'stack', 'bench', 'number', 'hair', 'boy', 'bag', 'child', 'platform', 'box', 'coat', 'cloud', 'black'] 2022-03-16 05:38:06,903.903 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'american', 'family', 'man', 'group', 'old', 'building', 'front', 'light', 'woman', 'ground', 'hair', 'person', 'table', 'boy', 'engine', 'paper', 'window', 'train', 'tree', 'sky', 'shirt', 'roof', 'tank', 'flag', 'smoke', 'jacket', 'bench', 'shed', 'barn', 'stack', 'ladder', 'picnic', 'tire', 'fashioned', 'vent'] 2022-03-16 05:40:30,491.491 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:58:32 iter: 2000 speed: 313.9 images/sec total_norm: 128.3105 (131.0135) loss: 166.7830 (165.3335) masked_loss: 2.3378 (2.3857) tag_loss: 164.6639 (162.9477) time: 1.4346 (1.6310) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4294 (1.6260) lr: 0.000097 max mem: 26307 2022-03-16 05:40:30,853.853 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5428571701049805 2022-03-16 05:40:30,853.853 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 150.54660034179688 2022-03-16 05:40:30,853.853 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 67.90539042154948 2022-03-16 05:40:34,006.006 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.013828023336827755 2022-03-16 05:40:34,007.007 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 05:40:34,007.007 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'on', 'a', 'boat', '[MASK]', 'at', 'the', 'water', 'and', 'the', 'sun', '[MASK]', 'on', 'the', 'water', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 05:40:34,022.022 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['dog', 'water', 'sky', 'collar', 'head', 'ear', 'neck', 'land', 'sun', 'boat', 'tree', 'button', 'lake', 'reflection', 'hill', '[UNK]', 'body', 'mountain', 'nose', 'light', 'eye', 'leg', 'rope', 'view', 'window', 'back', 'tail', 'harness', 'wave', 'pole', 'arm', 'paw', 'face', 'cloud', 'snout', 'buckle', 'hair', 'beach', 'belt', 'spot', 'mouth', 'white', 'seat', 'background', 'car', 'shadow', 'person', 'bar', 'bolt', 'leash'] 2022-03-16 05:40:49,996.996 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'water', 'light', 'hill', 'sun', 'neck', 'tree', 'sky', 'dog', 'boat', 'bell', 'nose', 'button', 'trunk', 'elbow', 'collar', 'harness', 'buckle'] 03-16 05:42:20.245 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 05:42:20.245 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 05:42:21.210 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 99}] 2022-03-16 05:43:13,797.797 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:56:32 iter: 2100 speed: 313.5 images/sec total_norm: 127.0578 (128.6500) loss: 167.9351 (168.6626) masked_loss: 2.3923 (2.4091) tag_loss: 165.2934 (166.2535) time: 1.4350 (1.6331) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4298 (1.6279) lr: 0.000097 max mem: 26307 2022-03-16 05:43:14,159.159 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.3125 2022-03-16 05:43:14,159.159 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 150.48577880859375 2022-03-16 05:43:14,159.159 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
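The interleaved aml_server.py monitor() lines report one dict per GPU, e.g. {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}. One way to produce the same records is nvidia-smi's documented query mode; the real aml_server.py may instead parse the plain `nvidia-smi` table it logs, so treat this as a sketch:

```python
import subprocess

def gpu_stats():
    """Return per-GPU dicts in the shape of the monitor() log lines."""
    out = subprocess.check_output(
        ["nvidia-smi",
         "--query-gpu=memory.used,memory.total,utilization.gpu",
         "--format=csv,noheader,nounits"],
        text=True)
    stats = []
    for line in out.strip().splitlines():
        # each line looks like "29000, 32510, 100"
        used, total, util = (int(x) for x in line.split(", "))
        stats.append({"mem_used": used, "mem_total": total, "gpu_util": util})
    return stats
```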
= 68.12507594715466 2022-03-16 05:43:17,299.299 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.013921589590609074 2022-03-16 05:43:17,299.299 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 05:43:17,299.299 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'elephant', 'in', 'the', 'zoo', 'is', 'pushing', 'against', 'the', 'tree', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 05:43:17,314.314 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['grass', 'ground', 'rock', 'leg', 'elephant', 'wall', 'zoo', 'ear', 'trunk', 'tree', 'enclosure', 'head', 'shadow', 'tail', 'bush', '[UNK]', 'foot', 'log', 'plant', 'dirt', 'weed', 'boulder', 'water', 'stump', 'eye', 'road', 'next', 'mouth', 'fence', 'stone', 'standing', 'large', 'hole', 'stick', 'area', 'block', 'pole', 'other', 'branch', 'animal', 'body', 'back', 'baby', 'hill', 'leaf', 'wire', 'sky', 'post', 'paw', 'bear'] 2022-03-16 05:43:33,356.356 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'water', 'ground', 'rock', 'wall', 'eye', 'tree', 'leg', 'ear', 'shadow', 'grass', 'tail', 'trunk', 'zoo', 'elephant', 'enclosure', 'stump'] 2022-03-16 05:45:57,163.163 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:54:30 iter: 2200 speed: 313.4 images/sec total_norm: 124.6061 (126.4061) loss: 165.6890 (167.8841) masked_loss: 2.3442 (2.3474) tag_loss: 163.9249 (165.5366) time: 1.4349 (1.6336) data: 0.0001 (0.0005) to_device: 0.0050 (0.0049) time_gpu: 1.4297 (1.6282) lr: 0.000097 max mem: 26307 2022-03-16 05:45:57,527.527 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4545454680919647 2022-03-16 05:45:57,527.527 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 177.9755401611328 2022-03-16 05:45:57,527.527 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 67.95332668138587 2022-03-16 05:46:00,698.698 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.013937680050730705 2022-03-16 05:46:00,698.698 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 05:46:00,699.699 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'gi', '##raf', '##fe', 'in', 'the', 'middle', 'of', 'a', 'small', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 05:46:00,714.714 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'sky', 'ground', '[UNK]', 'leg', 'head', 'neck', 'dirt', 'forest', 'branch', 'fence', 'bush', 'ear', 'stick', 'tail', 'log', 'rock', 'trunk', 'horn', 'wood', 'leaf', 'next', 'pole', 'eye', 'stump', 'field', 'spot', 'area', 'post', 'plant', 'grass', 'hill', 'face', 'animal', 'mountain', 'tall', 'standing', 'bird', 'zoo', 'large', 'hair', 'body', 'mane', 'mouth', 'brown', 'grassy', 'enclosure', 'small', 'walking', 'group'] 2022-03-16 05:46:16,769.769 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'small', 'ground', 'middle', 'post', 'forest', 'neck', 'tree', 'wood', 'branch', 'sky', 'leg', 'bush', 'stick', 'dirt', 'elephant'] 2022-03-16 05:48:40,316.316 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:52:18 iter: 2300 speed: 313.8 images/sec total_norm: 125.1410 (128.4399) loss: 166.4142 (167.0106) masked_loss: 2.3073 (2.3865) tag_loss: 163.6856 (164.6240) time: 1.4332 (1.6315) data: 0.0001 (0.0001) to_device: 0.0052 (0.0051) time_gpu: 1.4279 (1.6262) lr: 0.000097 max mem: 26307 2022-03-16 05:48:40,677.677 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4848484992980957 2022-03-16 05:48:40,677.677 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 189.9331512451172 2022-03-16 05:48:40,677.677 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 67.95998859405518 2022-03-16 05:48:43,896.896 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.013976288959383965 2022-03-16 05:48:43,896.896 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 05:48:43,897.897 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'skate', '##board', '##er', 'performing', 'a', 'stunt', '##chel', 'an', '[MASK]', 'area', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 05:48:43,912.912 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'man', 'building', 'step', 'shirt', 'stair', 'shoe', 'car', 'pole', 'sign', 'person', 'sidewalk', 'tree', 'street', 'boy', 'ground', 'light', 'van', 'hair', 'sky', 'hand', 'bench', 'arm', 'jean', 'wall', 'head', 'window', 'hat', 'railing', 'wheel', 'truck', 'rail', 'city', 'trick', 'road', 'leg', 'bag', 'jacket', 'park', 'lot', 'can', 'line', 'post', 'skate', 'tire', 'bicycle', 'ramp', 'board', 'pillar', 'roof'] 2022-03-16 05:48:59,960.960 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'can', 'city', 'man', 'area', 'building', 'road', 'street', 'light', 'car', 'ground', 'wall', 'boy', 'van', 'window', 'step', 'sign', 'shirt', 'bus', 'urban', 'traffic', 'bag', 'truck', 'hat', 'pole', 'bench', 'fence', 'reflection', 'shoe', 'sidewalk', 'tire', 'umbrella', 'pillar', 'stunt', 'railing', 'stair'] 2022-03-16 05:51:23,562.562 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:50:07 iter: 2400 speed: 313.6 images/sec total_norm: 133.2892 (136.3614) loss: 168.5715 (168.7346) masked_loss: 2.3544 (2.3869) tag_loss: 166.8105 (166.3478) time: 1.4354 (1.6325) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4296 (1.6272) lr: 0.000096 max mem: 26307 2022-03-16 05:51:23,929.929 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5 2022-03-16 05:51:23,929.929 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 135.15872192382812 2022-03-16 05:51:23,929.929 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.13902069091797 2022-03-16 05:51:27,179.179 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.014039664529263973 2022-03-16 05:51:27,179.179 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 05:51:27,179.179 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'snow', '##board', '##er', 'with', 'glasses', 'is', 'flying', 'through', '[MASK]', 'air', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 05:51:27,195.195 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'jacket', 'sky', 'glove', 'head', 'man', 'cloud', 'boot', 'leg', 'arm', 'helmet', 'hand', 'zipper', 'face', 'ski', 'coat', 'person', 'foot', 'snow', 'skier', 'board', 'air', 'pole', 'shoe', 'tree', 'ground', 'top', 'slope', 'logo', 'suit', 'knee', 'hill', 'black', 'mountain', 'stripe', 'hat', 'snowy', 'glasses', 'blue', 'yellow', 'strap', 'hood', 'design', 'white', 'jump', 'pine', 'hair', 'clothes', 'half', 'building'] 2022-03-16 05:51:43,151.151 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'air', 'person', 'arm', 'foot', 'sky', 'leg', 'coat', 'cloud', 'jacket', 'boot', 'helmet', 'shoe', 'glove', 'zipper'] 2022-03-16 05:54:06,777.777 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:47:51 iter: 2500 speed: 313.7 images/sec total_norm: 124.3773 (127.4217) loss: 166.9203 (166.5881) masked_loss: 2.3137 (2.3667) tag_loss: 164.1752 (164.2214) time: 1.4337 (1.6322) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4285 (1.6271) lr: 0.000096 max mem: 26307 2022-03-16 05:54:07,138.138 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663 2022-03-16 05:54:07,139.139 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 148.09341430664062 2022-03-16 05:54:07,139.139 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
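In the trainer.py lines, every meter is printed as two numbers, e.g. `loss: 167.9482 (166.3510)`. The pairing is consistent with the maskrcnn-benchmark-style convention of `windowed median (global average)`; that reading is an assumption, but a minimal meter under it looks like:

```python
from collections import deque

class SmoothedValue:
    """Prints as `median (global average)`, the apparent format of the
    trainer.py meters above. Window size and median convention are assumed."""
    def __init__(self, window=20):
        self.values = deque(maxlen=window)
        self.total = 0.0
        self.count = 0

    def update(self, v):
        self.values.append(v)
        self.total += v
        self.count += 1

    @property
    def median(self):
        s = sorted(self.values)
        return s[len(s) // 2]

    @property
    def global_avg(self):
        return self.total / self.count

    def __str__(self):
        return f"{self.median:.4f} ({self.global_avg:.4f})"

m = SmoothedValue()
for v in (167.9, 168.7, 166.4):
    m.update(v)
print(f"loss: {m}")   # -> loss: 167.9000 (167.6667)
```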
= 68.21351271409254 2022-03-16 05:54:10,420.420 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.014027805998921394 2022-03-16 05:54:10,421.421 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 05:54:10,421.421 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'very', 'pretty', 'glass', 'holding', 'some', '[MASK]', 'pretty', 'flowers', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 05:54:10,436.436 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['flower', 'table', 'bouquet', 'leaf', 'vase', 'glass', 'rose', 'stem', 'base', 'background', '[UNK]', 'water', 'chair', 'wall', 'plate', 'cloth', 'light', 'napkin', 'person', 'bud', 'window', 'fork', 'spoon', 'shadow', 'bottom', 'handle', 'white', 'plant', 'knife', 'reflection', 'wine', 'design', 'shirt', 'berry', 'paper', 'couple', 'object', 'bowl', 'top', 'curtain', 'ribbon', 'green', 'floor', 'man', 'group', 'candle', 'next', 'woman', 'cup', 'wooden'] 2022-03-16 05:54:26,435.435 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'light', 'table', 'glass', 'chair', 'pretty', 'background', 'flower', 'stem', 'vase', 'bouquet'] 2022-03-16 05:56:50,141.141 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:45:38 iter: 2600 speed: 313.4 images/sec total_norm: 125.8118 (127.3205) loss: 170.5232 (171.8502) masked_loss: 2.2644 (2.3067) tag_loss: 167.7460 (169.5435) time: 1.4339 (1.6336) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4288 (1.6285) lr: 0.000096 max mem: 26307 2022-03-16 05:56:50,503.503 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5 2022-03-16 05:56:50,503.503 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 194.7392578125 2022-03-16 05:56:50,503.503 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.09430016411676 2022-03-16 05:56:53,857.857 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01414740364998579 2022-03-16 05:56:53,857.857 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 05:56:53,857.857 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'girl', 'smiling', 'and', 'holding', 'a', '##vati', '[MASK]', 'remote', 'in', 'her', 'hands', 'over', 'her', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 05:56:53,872.872 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hair', 'shirt', 'wall', 'girl', 'eye', 'fireplace', 'arm', 'hand', 'mantle', 'head', 'controller', 'face', 'candle', 'nose', 'lamp', 'remote', 'picture', 'ear', '[UNK]', 'game', 'child', 'shelf', 'mouth', 'smile', 'sleeve', 'floor', 'book', 'strap', 'boy', 'curtain', 'teeth', 'room', 'couch', 'table', 'door', 'frame', 'wii', 'jean', 'chair', 'mirror', 'window', 'shade', 'bracelet', 'light', 'video', 'young', 'wrist', 'television', 'flower', 'toy'] 2022-03-16 05:57:09,942.942 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'hand', 'game', 'face', 'short', 'hair', 'girl', 'video', 'wall', 'arm', 'smile', 'eye', 'shirt', 'picture', 'nose', 'mirror', 'smiling', 'flower', 'remote', 'wrist', 'lamp', 'shelf', 'candle', 'fireplace', 'strap'] 2022-03-16 05:59:33,618.618 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:43:25 iter: 2700 speed: 313.2 images/sec total_norm: 125.1971 (127.6473) loss: 163.8296 (164.3393) masked_loss: 2.2841 (2.2669) tag_loss: 161.7627 (162.0724) time: 1.4349 (1.6348) data: 0.0001 (0.0002) to_device: 0.0052 (0.0050) time_gpu: 1.4297 (1.6296) lr: 0.000096 max mem: 26307 2022-03-16 05:59:33,980.980 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5151515007019043 2022-03-16 05:59:33,980.980 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 151.72967529296875 2022-03-16 05:59:33,980.980 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.14831297738212 2022-03-16 05:59:37,352.352 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.014310806058347225 2022-03-16 05:59:37,352.352 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 05:59:37,353.353 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'zebra', 'are', 'eating', '[MASK]', 'grass', 'underneath', 'an', 'umbrella', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 05:59:37,368.368 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['zebra', 'leg', 'ground', 'fence', 'mane', 'grass', 'head', 'shadow', 'ear', 'stripe', '[UNK]', 'tail', 'neck', 'eye', 'wall', 'enclosure', 'nose', 'dirt', 'mouth', 'zoo', 'pole', 'sand', 'tree', 'rock', 'post', 'body', 'building', 'field', 'pen', 'hay', 'next', 'log', 'trunk', 'other', 'shade', 'face', 'leaf', 'area', 'roof', 'couple', 'hair', 'standing', 'green', 'plant', 'bush', 'door', 'branch', 'spot', 'group', 'sky'] 2022-03-16 05:59:53,261.261 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'ground', 'wall', 'plant', 'leg', 'roof', 'ear', 'shadow', 'grass', 'tail', 'pole', 'dirt', 'fence', 'umbrella', 'enclosure', 'stripe', 'mane', 'zebra'] 2022-03-16 06:02:17,126.126 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:41:10 iter: 2800 speed: 313.1 images/sec total_norm: 129.6093 (133.3876) loss: 162.8348 (166.7582) masked_loss: 2.2670 (2.2786) tag_loss: 160.4984 (164.4796) time: 1.4355 (1.6351) data: 0.0001 (0.0002) to_device: 0.0052 (0.0050) time_gpu: 1.4302 (1.6299) lr: 0.000096 max mem: 26307 2022-03-16 06:02:17,490.490 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4848484992980957 2022-03-16 06:02:17,490.490 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 188.01812744140625 2022-03-16 06:02:17,490.490 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.09601908716662 2022-03-16 06:02:20,905.905 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.014494640752673149 2022-03-16 06:02:20,905.905 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 06:02:20,906.906 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'people', 'at', 'a', 'table', '[MASK]', 'a', 'laptop', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 06:02:20,921.921 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'hair', 'glasses', 'man', 'face', 'hand', 'woman', 'wall', '[UNK]', 'head', 'person', 'table', 'mouth', 'paper', 'arm', 'jean', 'ear', 'laptop', 'boy', 'nose', 'chair', 'computer', 'girl', 'keyboard', 'plate', 'window', 'board', 'button', 'knife', 'desk', 'cup', 'food', 'watch', 'book', 'floor', 'sleeve', 'collar', 'phone', 'door', 'light', 'glass', 'bottle', 'picture', 'finger', 'bowl', 'box', 'bracelet', 'ring', 'handle', 'napkin'] 2022-03-16 06:02:36,912.912 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'hand', 'face', 'book', 'hair', 'mouth', 'table', 'wall', 'arm', 'boy', 'eye', 'chair', 'paper', 'computer', 'watch', 'shirt', 'nose', 'ear', 'hat', 'wrist', 'glasses', 'mouse', 'sleeve', 'shelf', 'pad', 'laptop'] 2022-03-16 06:05:00,413.413 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:38:48 iter: 2900 speed: 313.6 images/sec total_norm: 128.8568 (130.7534) loss: 167.5192 (169.3143) masked_loss: 2.2662 (2.3142) tag_loss: 165.2468 (167.0001) time: 1.4342 (1.6329) data: 0.0001 (0.0002) to_device: 0.0052 (0.0050) time_gpu: 1.4290 (1.6277) lr: 0.000096 max mem: 26307 2022-03-16 06:05:00,775.775 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.42424243688583374 2022-03-16 06:05:00,776.776 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 137.3585205078125 2022-03-16 06:05:00,776.776 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
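The `eta` and `speed` fields in these trainer.py lines are mutually consistent: speed times the averaged iteration time comes out near 512 images in every row (e.g. 313.6 x 1.6329 ~ 512 at iter 2900), which suggests a global batch of 512 with eta = avg_iter_time x remaining iterations. A sketch under those assumptions; `global_batch` is inferred and `max_iter` below is a guess:

```python
import datetime

def progress_line(iter_num, max_iter, avg_iter_time, global_batch=512):
    """Reconstruct the eta/speed fields of the trainer.py log lines.
    global_batch=512 is inferred from speed * avg time; max_iter is unknown."""
    eta = datetime.timedelta(seconds=int(avg_iter_time * (max_iter - iter_num)))
    speed = global_batch / avg_iter_time
    # str(timedelta) renders as "1 day, 4:38:48", matching the log format
    return f"eta: {eta}  iter: {iter_num}  speed: {speed:.1f} images/sec"

print(progress_line(2900, 66000, 1.6329))  # approximate, since max_iter is a guess
```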
= 68.22449264526367 2022-03-16 06:05:04,227.227 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01450321264564991 2022-03-16 06:05:04,227.227 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 06:05:04,228.228 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'man', 'riding', 'a', 'skate', '##board', 'on', 'a', 'stone', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 06:05:04,243.243 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'stair', 'person', 'arm', 'man', 'leg', 'shirt', 'ground', 'tree', 'railing', 'sidewalk', 'building', 'hair', 'shoe', 'head', 'shadow', 'hand', 'pole', 'wall', 'wheel', 'boy', 'staircase', 'woman', 'step', 'window', 'board', 'foot', 'hat', 'sign', 'street', 'jacket', 'fence', 'photo', 'trunk', 'light', 'ramp', 'walkway', 'bridge', 'post', 'bench', 'bag', 'girl', 'trick', 'black', 'background', 'pillar', 'roof', 'park', 'door', 'platform'] 2022-03-16 06:05:20,213.213 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'building', 'woman', 'ground', 'board', 'person', 'arm', 'boy', 'bridge', 'stone', 'window', 'shirt', 'leg', 'bag', 'wheel', 'column', 'hat', 'statue', 'jacket', 'bench', 'fence', 'fountain', 'sidewalk', 'ramp', 'pillar', 'stair'] 2022-03-16 06:07:43,900.900 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:36:30 iter: 3000 speed: 313.2 images/sec total_norm: 127.2282 (132.1906) loss: 161.5379 (165.5154) masked_loss: 2.1634 (2.2410) tag_loss: 158.9387 (163.2745) time: 1.4341 (1.6349) data: 0.0001 (0.0002) to_device: 0.0052 (0.0050) time_gpu: 1.4291 (1.6297) lr: 0.000095 max mem: 26307 2022-03-16 06:07:44,261.261 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4444444477558136 2022-03-16 06:07:44,261.261 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 180.79383850097656 2022-03-16 06:07:44,261.261 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.0646368457425 2022-03-16 06:07:47,760.760 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.014777246862649918 2022-03-16 06:07:47,760.760 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 06:07:47,760.760 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'cat', 'poking', '[MASK]', "'", 's', 'now', 'on', 'a', 'stuffed', 'toy', 'bird', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 06:07:47,776.776 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['eye', 'head', 'bird', 'window', 'tree', 'ear', 'cat', 'nose', 'beak', '[UNK]', 'tail', 'face', 'feather', 'toy', 'animal', 'leg', 'parrot', 'sky', 'wall', 'wing', 'paw', 'body', 'ledge', 'table', 'plant', 'mouth', 'leaf', 'cage', 'foot', 'curtain', 'duck', 'floor', 'glass', 'arm', 'button', 'green', 'frame', 'fur', 'chest', 'fence', 'trunk', 'rabbit', 'bush', 'collar', 'small', 'light', 'top', 'teddy', 'screen', 'bowl'] 2022-03-16 06:08:03,738.738 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'eye', 'neck', 'window', 'wing', 'tree', 'sky', 'dog', 'ear', 'bird', 'cat', 'grass', 'tail', 'toy', 'collar', 'stuffed', 'paw', 'beak'] 2022-03-16 06:10:27,447.447 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:34:11 iter: 3100 speed: 313.1 images/sec total_norm: 126.7292 (129.8545) loss: 163.0264 (165.4687) masked_loss: 2.2443 (2.2429) tag_loss: 160.7388 (163.2258) time: 1.4351 (1.6355) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4297 (1.6302) lr: 0.000095 max mem: 26307 2022-03-16 06:10:27,810.810 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361 2022-03-16 06:10:27,810.810 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 156.93826293945312 2022-03-16 06:10:27,811.811 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.15096294879913 2022-03-16 06:10:31,319.319 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.014928682707250118 2022-03-16 06:10:31,320.320 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 06:10:31,320.320 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'woman', 'with', 'large', ',', '[MASK]', 'kite', '[MASK]', 'close', 'to', 'the', 'ground', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 06:10:31,335.335 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['kite', 'sky', 'tree', 'string', 'tail', 'person', 'shirt', 'man', 'ground', 'woman', 'grass', '[UNK]', 'park', 'building', 'hair', 'shadow', 'head', 'jacket', 'car', 'hat', 'jean', 'short', 'boy', 'tent', 'field', 'child', 'ribbon', 'blue', 'hand', 'bush', 'girl', 'fence', 'leg', 'flag', 'cloud', 'beach', 'large', 'face', 'arm', 'bunch', 'rainbow', 'umbrella', 'group', 'couple', 'colorful', 'eye', 'bag', 'shoe', 'street', 'design'] 2022-03-16 06:10:47,393.393 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'house', 'large', 'park', 'woman', 'short', 'ground', 'hair', 'girl', 'person', 'tree', 'sky', 'shirt', 'leg', 'camera', 'string', 'shadow', 'grass', 'tail', 'bush', 'hat', 'jacket', 'shoe', 'colorful', 'kite', 'sock'] 03-16 06:12:21.311 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 06:12:21.312 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 06:12:22.579 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-16 06:13:10,997.997 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:31:50 iter: 3200 speed: 313.1 images/sec total_norm: 127.8397 (133.1336) loss: 166.6285 (165.9115) masked_loss: 2.2942 (2.3085) tag_loss: 164.0938 (163.6030) time: 1.4334 (1.6355) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4281 (1.6302) lr: 0.000095 max mem: 26307 2022-03-16 06:13:11,359.359 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6176470518112183 2022-03-16 06:13:11,359.359 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 146.1209716796875 2022-03-16 06:13:11,359.359 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.17579731796727 2022-03-16 06:13:14,934.934 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.014963779598474503 2022-03-16 06:13:14,934.934 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 06:13:14,935.935 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'girl', 'sitting', 'down', ',', 'outside', '[MASK]', 'eating', 'a', 'sandwich', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 06:13:14,950.950 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hair', 'eye', 'girl', 'nose', 'hand', 'shirt', 'face', 'tree', 'man', 'head', 'mouth', 'bread', 'cake', 'food', '[UNK]', 'table', 'woman', 'finger', 'glasses', 'sandwich', 'window', 'chair', 'background', 'pillar', 'person', 'flower', 'ear', 'column', 'dress', 'plate', 'young', 'arm', 'sky', 'napkin', 'bang', 'building', 'ring', 'eyebrow', 'necklace', 'trunk', 'bush', 'child', 'little', 'top', 'jacket', 'pizza', 'grass', 'watch', 'pole', 'wall'] 2022-03-16 06:13:30,960.960 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'hair', 'girl', 'outside', 'mouth', 'food', 'eye', 'chair', 'tree', 'jean', 'shirt', 'finger', 'nose', 'column', 'bread', 'glasses', 'eyebrow', 'cake', 'sandwich', 'necklace', 'pillar', 'strap'] 2022-03-16 06:15:54,774.774 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:29:33 iter: 3300 speed: 312.6 images/sec total_norm: 126.2624 (128.1804) loss: 162.7510 (165.4645) masked_loss: 2.3192 (2.2897) tag_loss: 160.6864 (163.1749) time: 1.4357 (1.6378) data: 0.0001 (0.0002) to_device: 0.0052 (0.0050) time_gpu: 1.4307 (1.6326) lr: 0.000095 max mem: 26307 2022-03-16 06:15:55,135.135 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.38235294818878174 2022-03-16 06:15:55,135.135 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 172.38565063476562 2022-03-16 06:15:55,135.135 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.2174820619471 2022-03-16 06:15:58,650.650 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.014947595074772835 2022-03-16 06:15:58,650.650 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 06:15:58,651.651 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'there', 'are', 'several', 'warship', 'players', 'practicing', 'for', '[MASK]', 'game', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 06:15:58,666.666 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['baseball', 'man', '[UNK]', 'shirt', 'bat', 'net', 'ball', 'hat', 'player', 'field', 'pole', 'stadium', 'cap', 'glove', 'shoe', 'person', 'head', 'sign', 'jersey', 'ground', 'hand', 'uniform', 'grass', 'leg', 'stand', 'logo', 'batter', 'dirt', 'seat', 'fence', 'wall', 'line', 'goal', 'shadow', 'number', 'mound', 'game', 'arm', 'light', 'umpire', 'helmet', 'tennis', 'netting', 'building', 'base', 'sky', 'pitcher', 'jacket', 'chair', 'catcher'] 2022-03-16 06:16:14,681.681 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'several', 'game', 'player', 'ground', 'person', 'seat', 'arm', 'stadium', 'baseball', 'ball', 'sign', 'shirt', 'jersey', 'wheel', 'grass', 'net', 'hat', 'cap', 'pole', 'bat', 'shoe', 'tire', 'mat', 'glove', 'umpire'] 2022-03-16 06:18:38,300.300 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:27:09 iter: 3400 speed: 313.1 images/sec total_norm: 124.4796 (126.8232) loss: 166.1431 (169.2912) masked_loss: 2.2369 (2.2224) tag_loss: 163.6011 (167.0687) time: 1.4338 (1.6353) data: 0.0001 (0.0005) to_device: 0.0050 (0.0049) time_gpu: 1.4289 (1.6299) lr: 0.000095 max mem: 26307 2022-03-16 06:18:38,661.661 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.37142857909202576 2022-03-16 06:18:38,661.661 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 194.81141662597656 2022-03-16 06:18:38,662.662 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
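The `caption acc` values are exact float32 ratios with small, varying denominators (0.4, 5/11 ~ 0.4545, 5/16 = 0.3125, 15/31 ~ 0.4839, ...), which fits a per-batch accuracy over only the supervised masked positions. A presumed computation; the ignore-label convention is an assumption, not read from the pipeline:

```python
import torch

def caption_accuracy(logits, target, ignore_index=-100):
    """Fraction of supervised (masked) caption positions where the argmax
    prediction matches the label; positions labelled ignore_index are skipped."""
    pred = logits.argmax(dim=-1)
    valid = target != ignore_index
    return (pred[valid] == target[valid]).float().mean()

logits = torch.tensor([[2.0, 0.1], [0.2, 1.0], [0.5, 0.4]])
target = torch.tensor([0, 1, -100])
print(caption_accuracy(logits, target))  # tensor(1.) -> both valid positions correct
```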
= 68.13086558750697 2022-03-16 06:18:42,268.268 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015078512020409107 2022-03-16 06:18:42,268.268 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 06:18:42,269.269 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'of', 'green', '[MASK]', 'signs', 'sitting', 'on', 'top', 'of', 'a', 'pole', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 06:18:42,284.284 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'sign', 'cloud', 'pole', 'tree', 'letter', 'street', 'building', 'car', '[UNK]', 'road', 'line', 'window', 'sidewalk', 'light', 'arrow', 'bush', 'wire', 'stop', 'roof', 'number', 'word', 'traffic', 'power', 'house', 'tire', 'wall', 'green', 'truck', 'curb', 'grass', 'person', 'post', 'intersection', 'fence', 'van', 'can', 'blue', 'red', 'wheel', 'shadow', 'bridge', 'fire', 'way', 'bolt', 'side', 'background', 'corner', 'writing', 'parking'] 2022-03-16 06:18:58,399.399 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['house', 'number', 'line', 'top', 'road', 'power', 'street', 'car', 'green', 'bridge', 'couple', 'tree', 'letter', 'sign', 'sky', 'circle', 'truck', 'grass', 'cloud', 'pole', 'sidewalk', 'curb'] 2022-03-16 06:21:21,819.819 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:24:44 iter: 3500 speed: 313.1 images/sec total_norm: 124.2238 (125.7086) loss: 168.4714 (170.0881) masked_loss: 2.1295 (2.1714) tag_loss: 166.4771 (167.9167) time: 1.4342 (1.6352) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4292 (1.6301) lr: 0.000095 max mem: 26307 2022-03-16 06:21:22,180.180 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4838709533214569 2022-03-16 06:21:22,180.180 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 178.1074981689453 2022-03-16 06:21:22,181.181 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.00398275587294 2022-03-16 06:21:25,801.801 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015096554532647133 2022-03-16 06:21:25,802.802 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 06:21:25,802.802 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'riding', 'in', 'a', 'boat', '[MASK]', 'a', 'dog', 'on', 'his', 'lap', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 06:21:25,818.818 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hand', 'dog', 'hair', 'sunglasses', 'man', 'head', 'jacket', 'collar', 'water', 'short', 'sky', 'boat', 'leg', 'ear', 'face', 'nose', 'seat', 'can', 'window', 'pole', 'cup', 'shirt', '[UNK]', 'car', 'handle', 'arm', 'shadow', 'chair', 'mouth', 'tree', 'neck', 'foot', 'sleeve', 'glasses', 'bar', 'drink', 'finger', 'bench', 'door', 'vehicle', 'paw', 'beach', 'ocean', 'ring', 'wave', 'land', 'lid', 'beer', 'table', 'person'] 2022-03-16 06:21:41,866.866 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'can', 'head', 'man', 'hand', 'face', 'water', 'cup', 'short', 'hair', 'seat', 'chair', 'bar', 'sky', 'shirt', 'dog', 'boat', 'coffee', 'leg', 'ear', 'lap', 'pole', 'jacket', 'collar', 'sunglasses'] 2022-03-16 06:24:05,675.675 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:22:24 iter: 3600 speed: 312.5 images/sec total_norm: 127.5525 (129.4285) loss: 162.5210 (162.6636) masked_loss: 2.2596 (2.2590) tag_loss: 160.2764 (160.4045) time: 1.4350 (1.6386) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4297 (1.6335) lr: 0.000095 max mem: 26307 2022-03-16 06:24:06,036.036 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5454545617103577 2022-03-16 06:24:06,037.037 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 165.65032958984375 2022-03-16 06:24:06,037.037 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 67.91705899625211 2022-03-16 06:24:09,676.676 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015121040865778923 2022-03-16 06:24:09,676.676 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 06:24:09,677.677 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', 'two', '[MASK]', 'are', 'proud', 'of', 'their', 'unusual', 'cake', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 06:24:09,692.692 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'hair', 'hand', 'man', 'wall', 'woman', 'head', '[UNK]', 'cake', 'arm', 'picture', 'table', 'face', 'pizza', 'mouth', 'ear', 'crab', 'person', 'jacket', 'glass', 'girl', 'plate', 'nose', 'eye', 'leg', 'couple', 'knife', 'jean', 'food', 'glasses', 'ceiling', 'floor', 'sign', 'window', 'short', 'door', 'bottle', 'board', 'light', 'finger', 'chair', 'lady', 'poster', 'flower', 'bag', 'sweater', 'tray', 'box', 'apron', 'cup'] 2022-03-16 06:24:25,750.750 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'hand', 'woman', 'hair', 'person', 'table', 'wall', 'food', 'couple', 'shirt', 'picture', 'leg', 'dress', 'nose', 'ear', 'knife', 'unusual', 'proud', 'cake', 'badge', 'tray', 'necklace', 'candle', 'crab'] 2022-03-16 06:26:49,556.556 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:20:03 iter: 3700 speed: 312.4 images/sec total_norm: 124.1741 (129.3139) loss: 162.8971 (164.3228) masked_loss: 2.0609 (2.0956) tag_loss: 160.4019 (162.2273) time: 1.4354 (1.6388) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4303 (1.6336) lr: 0.000094 max mem: 26307 2022-03-16 06:26:49,916.916 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4375 2022-03-16 06:26:49,917.917 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 142.99635314941406 2022-03-16 06:26:49,917.917 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.03635366339432 2022-03-16 06:26:53,600.600 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015165851451456547 2022-03-16 06:26:53,601.601 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 06:26:53,601.601 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', '[MASK]', 'surfing', 'on', 'a', 'board', 'in', 'the', 'water', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 06:26:53,616.616 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'water', 'mountain', 'tree', 'hill', 'wave', '[UNK]', 'beach', 'boat', 'head', 'person', 'sand', 'leg', 'hair', 'man', 'cloud', 'shirt', 'arm', 'hand', 'board', 'shore', 'rock', 'reflection', 'ocean', 'ear', 'short', 'ground', 'background', 'foot', 'woman', 'face', 'house', 'building', 'jacket', 'boy', 'dog', 'grass', 'hat', 'pole', 'tail', 'lake', 'girl', 'umbrella', 'body', 'child', 'rope', 'forest', 'bird', 'large', 'shoe'] 2022-03-16 06:27:09,623.623 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'hand', 'water', 'short', 'board', 'hair', 'arm', 'hill', 'mountain', 'tree', 'sky', 'shirt', 'leg', 'wave'] 2022-03-16 06:29:33,461.461 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:17:41 iter: 3800 speed: 312.4 images/sec total_norm: 127.1361 (129.9609) loss: 166.2564 (168.0676) masked_loss: 2.0833 (2.1578) tag_loss: 163.2929 (165.9098) time: 1.4368 (1.6390) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4314 (1.6339) lr: 0.000094 max mem: 26307 2022-03-16 06:29:33,823.823 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5428571701049805 2022-03-16 06:29:33,824.824 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 171.9650115966797 2022-03-16 06:29:33,824.824 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.0916255070613 2022-03-16 06:29:37,532.532 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015115528367459774 2022-03-16 06:29:37,532.532 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 06:29:37,532.532 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'with', 'a', 'tennis', 'ball', 'in', 'one', 'hand', 'and', '[MASK]', 'tennis', 'rack', '##et', 'in', 'the', '[MASK]', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 06:29:37,548.548 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hand', 'shirt', '[UNK]', 'man', 'tennis', 'short', 'arm', 'court', 'head', 'leg', 'wall', 'hair', 'ball', 'face', 'band', 'ear', 'nose', 'logo', 'handle', 'wrist', 'mouth', 'shoe', 'sock', 'ground', 'player', 'line', 'eye', 'sleeve', 'letter', 'cap', 'hat', 'watch', 'stripe', 'string', 'bracelet', 'fence', 'collar', 'beard', 'finger', 'male', 'shadow', 'chair', 'person', 'sign', 'glasses', 'knee', 'necklace', 'stand', 'background', 'woman'] 2022-03-16 06:29:53,533.533 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'other', 'head', 'man', 'hand', 'face', 'band', 'court', 'short', 'hair', 'mouth', 'wall', 'arm', 'eye', 'watch', 'ball', 'letter', 'shirt', 'background', 'nose', 'handle', 'tennis', 'net', 'wrist', 'logo', 'beard', 'sleeve', 'curtain', 'stripe'] 2022-03-16 06:32:17,360.360 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:15:18 iter: 3900 speed: 312.4 images/sec total_norm: 127.2599 (131.2631) loss: 167.6900 (166.4754) masked_loss: 2.0399 (2.1247) tag_loss: 165.4315 (164.3507) time: 1.4353 (1.6390) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4301 (1.6339) lr: 0.000094 max mem: 26307 2022-03-16 06:32:17,722.722 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5277777910232544 2022-03-16 06:32:17,722.722 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 177.50115966796875 2022-03-16 06:32:17,723.723 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
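`Tag Precision.` (its value lands at the start of the following log line) sits on a 0-100 scale, while `Tag mAP` stays near 0.015, which fits average precision taken over the full tag vocabulary rather than over the ~50 tags shown in `Sample Generation`. One plausible per-image AP, with mAP as its mean over images; the exact averaging used by the pipeline is not visible in this log:

```python
import numpy as np

def average_precision(scores, gt):
    """AP for one image: `scores` ranks every vocabulary tag, `gt` is the set
    of ground-truth tag indices (cf. the GT Tags records above)."""
    order = np.argsort(-np.asarray(scores))
    hits, ap = 0, 0.0
    for rank, idx in enumerate(order, start=1):
        if idx in gt:
            hits += 1
            ap += hits / rank        # precision at each recall point
    return ap / max(len(gt), 1)

print(average_precision([0.9, 0.1, 0.8, 0.3], gt={0, 2}))  # 1.0: both GT tags ranked first
```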
= 68.04300327301026 2022-03-16 06:32:21,453.453 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015265017747879028 2022-03-16 06:32:21,453.453 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 06:32:21,453.453 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'black', 'and', 'white', 'flaps', 'a', 'small', 'bathroom', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 06:32:21,469.469 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'bathroom', 'mirror', 'toilet', 'sink', 'floor', '[UNK]', 'outlet', 'paper', 'pipe', 'lid', 'seat', 'tank', 'bottle', 'door', 'light', 'trash', 'towel', 'bowl', 'shadow', 'sign', 'handle', 'bag', 'can', 'switch', 'reflection', 'tissue', 'holder', 'man', 'tile', 'soap', 'curtain', 'box', 'ceiling', 'shelf', 'person', 'basket', 'cup', 'shower', 'white', 'picture', 'roll', 'drain', 'window', 'head', 'hand', 'base', 'stall', 'small', 'hair'] 2022-03-16 06:32:37,505.505 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'small', 'black', 'white', 'light', 'floor', 'wall', 'seat', 'bar', 'box', 'tank', 'mirror', 'bathroom', 'switch', 'sink', 'tissue', 'pipe', 'reflection', 'dish', 'towel', 'curtain', 'shelf', 'toilet', 'lid', 'outlet'] 2022-03-16 06:35:01,423.423 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:12:56 iter: 4000 speed: 312.1 images/sec total_norm: 127.7047 (130.4037) loss: 161.2848 (163.2589) masked_loss: 2.0475 (2.1304) tag_loss: 158.8457 (161.1286) time: 1.4358 (1.6406) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4306 (1.6354) lr: 0.000094 max mem: 26307 2022-03-16 06:35:01,785.785 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6176470518112183 2022-03-16 06:35:01,786.786 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 181.9211883544922 2022-03-16 06:35:01,786.786 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.03520258461556 2022-03-16 06:35:05,562.562 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015245223417878151 2022-03-16 06:35:05,562.562 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 06:35:05,563.563 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'zebra', '[MASK]', 'nu', '##zzle', 'each', 'other', 'while', 'another', 'zebra', 'stands', 'in', 'the', 'background', '[MASK]', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 06:35:05,578.578 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['leg', 'zebra', 'ear', 'ground', 'shadow', 'head', 'mane', 'grass', '[UNK]', 'nose', 'eye', 'tree', 'stripe', 'tail', 'field', 'dirt', 'face', 'mouth', 'branch', 'rock', 'other', 'hair', 'trunk', 'body', 'bush', 'neck', 'background', 'foot', 'baby', 'next', 'area', 'leaf', 'standing', 'road', 'back', 'grassy', 'couple', 'adult', 'group', 'dry', 'small', 'snout', 'stick', 'herd', 'spot', 'mother', 'white', 'sand', 'surface', 'black'] 2022-03-16 06:35:21,595.595 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'other', 'head', 'face', 'field', 'ground', 'rock', 'eye', 'tree', 'leg', 'background', 'nose', 'ear', 'shadow', 'grass', 'tail', 'stripe', 'mane', 'zebra'] 2022-03-16 06:37:45,517.517 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:10:34 iter: 4100 speed: 312.0 images/sec total_norm: 128.1899 (130.5380) loss: 169.8493 (168.4372) masked_loss: 2.1101 (2.1154) tag_loss: 166.4966 (166.3218) time: 1.4340 (1.6409) data: 0.0001 (0.0001) to_device: 0.0052 (0.0051) time_gpu: 1.4287 (1.6357) lr: 0.000094 max mem: 26307 2022-03-16 06:37:45,875.875 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.3529411852359772 2022-03-16 06:37:45,875.875 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 177.58633422851562 2022-03-16 06:37:45,876.876 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.04508300054641 2022-03-16 06:37:49,695.695 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015245894901454449 2022-03-16 06:37:49,695.695 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 06:37:49,695.695 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'herd', 'of', 'sheep', 'walking', 'along', 'a', 'lush', '[MASK]', 'field', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 06:37:49,711.711 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'sky', 'grass', 'sheep', 'field', 'fence', 'herd', 'green', 'animal', 'grassy', 'pasture', 'leg', '[UNK]', 'head', 'pole', 'bunch', 'lush', 'trunk', 'wood', 'post', 'hill', 'building', 'flock', 'leaf', 'large', 'background', 'grazing', 'house', 'bush', 'cow', 'group', 'cloud', 'road', 'mountain', 'open', 'forest', 'wool', 'person', 'wire', 'roof', 'white', 'lamb', 'rock', 'area', 'big', 'distance', 'middle', 'day', 'tail', 'goat'] 2022-03-16 06:38:05,683.683 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['field', 'green', 'hill', 'tree', 'sky', 'walking', 'grass', 'bush', 'cloud', 'pole', 'trunk', 'sheep', 'herd', 'lush'] 2022-03-16 06:40:29,468.468 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:08:08 iter: 4200 speed: 312.3 images/sec total_norm: 125.5093 (128.2729) loss: 163.6538 (164.5845) masked_loss: 2.0316 (2.1320) tag_loss: 162.0480 (162.4525) time: 1.4346 (1.6395) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4295 (1.6345) lr: 0.000094 max mem: 26307 2022-03-16 06:40:29,829.829 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5151515007019043 2022-03-16 06:40:29,829.829 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 168.20590209960938 2022-03-16 06:40:29,829.829 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
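The `Input ids sample` lines show the caption side of each batch: a `[CLS]` ... `[SEP]` WordPiece sequence with some tokens replaced by `[MASK]`, padded to a fixed length of 70. A sketch of that construction, assuming BERT-style random masking at roughly 15% (the rate, and whether an 80/10/10 mask/replace/keep split is applied, are assumptions not visible in the log):

```python
import random
from transformers import BertTokenizer

tok = BertTokenizer.from_pretrained("bert-base-uncased")

def build_masked_caption(caption, max_len=70, mask_rate=0.15, seed=0):
    """Tokenize a caption, randomly mask tokens, and pad to max_len,
    reproducing the shape of the `Input ids sample` lines."""
    rng = random.Random(seed)
    tokens = ["[CLS]"] + tok.tokenize(caption)[: max_len - 2] + ["[SEP]"]
    tokens = ["[MASK]" if t not in ("[CLS]", "[SEP]") and rng.random() < mask_rate
              else t for t in tokens]
    return tokens + ["[PAD]"] * (max_len - len(tokens))

print(build_masked_caption("a herd of sheep walking along a lush green field ."))
```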
= 68.05043224955715 2022-03-16 06:40:33,683.683 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015223406255245209 2022-03-16 06:40:33,683.683 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 06:40:33,683.683 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'black', 'truck', 'parked', 'on', 'the', 'curb', 'with', '[MASK]', 'sign', 'beside', '也', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 06:40:33,699.699 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'tree', 'sign', 'windshield', 'tire', 'window', 'sky', 'building', 'plate', 'roof', 'pole', 'person', 'ground', 'license', 'truck', 'car', 'grill', 'light', 'road', 'man', 'mirror', 'street', 'bumper', 'hood', 'jacket', 'shirt', 'front', 'writing', 'woman', 'wheel', 'sidewalk', 'house', 'bus', 'hat', 'wall', 'door', 'bag', 'stop', 'bush', 'jean', 'van', 'shoe', 'coat', 'chimney', 'next', 'fence', 'logo', 'parking', 'lot', 'child'] 2022-03-16 06:40:49,639.639 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'house', 'black', 'building', 'road', 'light', 'ground', 'board', 'person', 'wall', 'writing', 'window', 'tree', 'sign', 'sky', 'shirt', 'roof', 'bag', 'truck', 'plate', 'wheel', 'mirror', 'brick', 'license', 'pole', 'hood', 'tire', 'curb', 'grill', 'windshield'] 03-16 06:42:22.581 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 06:42:22.581 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 06:42:23.638 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-16 06:43:13,354.354 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:05:41 iter: 4300 speed: 312.4 images/sec total_norm: 124.6680 (127.3306) loss: 164.0845 (166.2099) masked_loss: 2.1034 (2.1543) tag_loss: 162.1899 (164.0556) time: 1.4347 (1.6389) data: 0.0001 (0.0002) to_device: 0.0049 (0.0047) time_gpu: 1.4297 (1.6340) lr: 0.000094 max mem: 26307 2022-03-16 06:43:13,717.717 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6388888955116272 2022-03-16 06:43:13,717.717 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 165.03314208984375 2022-03-16 06:43:13,717.717 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
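The interleaved `aml_server.py` `monitor()` entries report one dict per GPU with `mem_used`, `mem_total`, and `gpu_util`. The shape of that output can be reproduced by querying nvidia-smi in CSV mode; how aml_server.py itself parses the tool's output is not shown, so this is only a sketch:

```python
import subprocess

def gpu_stats():
    """One dict per GPU, matching the monitor() entries in this log."""
    out = subprocess.check_output(
        ["nvidia-smi",
         "--query-gpu=memory.used,memory.total,utilization.gpu",
         "--format=csv,noheader,nounits"],
        text=True,
    )
    stats = []
    for line in out.strip().splitlines():
        used, total, util = (int(x) for x in line.split(", "))
        stats.append({"mem_used": used, "mem_total": total, "gpu_util": util})
    return stats

print(gpu_stats())  # e.g. [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, ...]
```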
= 68.14587593078613 2022-03-16 06:43:17,609.609 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01524761039763689 2022-03-16 06:43:17,609.609 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 06:43:17,610.610 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'teddy', '[MASK]', 'sitting', 'at', 'a', 'table', '[MASK]', 'drinking', '[MASK]', 'on', 'it', 'with', 'more', 'teddy', 'bears', 'in', 'the', 'background', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 06:43:17,625.625 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['bear', 'chair', 'table', 'floor', 'teddy', 'animal', 'glass', 'bow', 'stuffed', 'ribbon', 'head', 'leg', 'tile', 'person', 'ball', '[UNK]', 'foot', 'pole', 'bottle', 'dog', 'shoe', 'hat', 'shirt', 'ear', 'monkey', 'tag', 'toy', 'basket', 'stool', 'nose', 'doll', 'room', 'window', 'paw', 'arm', 'store', 'scarf', 'paper', 'bag', 'cup', 'man', 'woman', 'cushion', 'sign', 'group', 'ground', 'display', 'bar', 'sweater', 'wall'] 2022-03-16 06:43:33,619.619 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'floor', 'table', 'glass', 'chair', 'animal', 'leg', 'background', 'bear', 'tail', 'bottle', 'drinking', 'hat', 'statue', 'bow', 'lighter', 'ribbon', 'teddy', 'stuffed', 'bucket'] 2022-03-16 06:45:57,540.540 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:03:18 iter: 4400 speed: 311.8 images/sec total_norm: 123.7966 (126.1624) loss: 166.1933 (169.4271) masked_loss: 2.1666 (2.2050) tag_loss: 164.4248 (167.2221) time: 1.4364 (1.6419) data: 0.0001 (0.0002) to_device: 0.0049 (0.0048) time_gpu: 1.4314 (1.6369) lr: 0.000093 max mem: 26307 2022-03-16 06:45:57,902.902 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6666666865348816 2022-03-16 06:45:57,902.902 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 146.14837646484375 2022-03-16 06:45:57,902.902 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.210033331977 2022-03-16 06:46:01,826.826 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015206034295260906 2022-03-16 06:46:01,826.826 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 06:46:01,827.827 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'floating', 'along', 'a', 'shore', 'line', 'with', '[MASK]', 'of', 'cranes', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 06:46:01,842.842 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['flag', 'sky', 'water', 'boat', 'crane', 'dock', 'window', 'cloud', '[UNK]', 'pole', 'harbor', 'person', 'building', 'bridge', 'man', 'tire', 'cabin', 'pier', 'sign', 'light', 'door', 'ship', 'box', 'rope', 'truck', 'large', 'street', 'tower', 'river', 'wall', 'car', 'stripe', 'mast', 'number', 'life', 'cone', 'wheel', 'roof', 'american', 'shirt', 'small', 'tree', 'container', 'structure', 'top', 'post', 'red', 'stair', 'name', 'white'] 2022-03-16 06:46:17,884.884 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'american', 'life', 'line', 'water', 'building', 'river', 'door', 'light', 'fire', 'writing', 'window', 'letter', 'sky', 'boat', 'flag', 'shore', 'cloud', 'pole', 'rope', 'dock', 'crane', 'stripe'] 2022-03-16 06:48:41,557.557 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:00:51 iter: 4500 speed: 312.2 images/sec total_norm: 124.5511 (126.5793) loss: 162.7146 (162.9010) masked_loss: 2.1206 (2.0979) tag_loss: 160.3948 (160.8031) time: 1.4348 (1.6401) data: 0.0001 (0.0005) to_device: 0.0049 (0.0048) time_gpu: 1.4298 (1.6348) lr: 0.000093 max mem: 26307 2022-03-16 06:48:41,917.917 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361 2022-03-16 06:48:41,918.918 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 181.21803283691406 2022-03-16 06:48:41,918.918 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
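`caption acc` values such as 0.5757... (≈ 19/33) and 0.6388... (≈ 23/36) imply small, varying denominators, consistent with accuracy being computed only over the masked caption positions of the logged batch rather than over all tokens. A sketch under that assumption:

```python
import torch

def masked_token_accuracy(logits, target, mask):
    """Fraction of masked positions predicted correctly; only positions
    where mask is True contribute (assumption, not stated in the log)."""
    pred = logits.argmax(dim=-1)
    return (pred[mask] == target[mask]).float().mean().item()

logits = torch.randn(2, 70, 30522)       # batch, seq_len 70, BERT vocab size
target = torch.randint(0, 30522, (2, 70))
mask = torch.zeros(2, 70, dtype=torch.bool)
mask[:, 3:10] = True                      # pretend these were the [MASK] slots
print("caption acc =", masked_token_accuracy(logits, target, mask))
```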
= 68.16388685806938 2022-03-16 06:48:45,878.878 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01532444916665554 2022-03-16 06:48:45,878.878 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 06:48:45,878.878 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'tow', 'lady', "'", 's', 'enjoying', 'a', 'chocolate', '[MASK]', 'and', 'some', 'coffee', '##ע', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 06:48:45,894.894 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['table', 'necklace', 'woman', 'plate', '[UNK]', 'sweater', 'shirt', 'hair', 'wall', 'hand', 'fork', 'knife', 'handle', 'cup', 'window', 'coffee', 'cake', 'face', 'chair', 'spoon', 'food', 'neck', 'nose', 'mug', 'lid', 'restaurant', 'napkin', 'glasses', 'bowl', 'glass', 'person', 'pot', 'head', 'kettle', 'ear', 'girl', 'bottle', 'pitcher', 'mouth', 'flower', 'fireplace', 'plant', 'fruit', 'eye', 'tea', 'candle', 'belt', 'cabinet', 'door', 'dessert'] 2022-03-16 06:49:01,979.979 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'hand', 'face', 'woman', 'cup', 'hair', 'girl', 'person', 'table', 'wall', 'lady', 'chair', 'plant', 'shirt', 'kitchen', 'coffee', 'bowl', 'handle', 'plate', 'cabinet', 'knife', 'bottle', 'fruit', 'liquid', 'sink', 'glasses', 'chocolate', 'purse', 'fork', 'cake', 'basket', 'necklace', 'drawer', 'sweater', 'mug', 'soda', 'banana', 'spoon', 'tow', 'dessert', 'napkin'] 2022-03-16 06:51:25,781.781 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:58:26 iter: 4600 speed: 311.8 images/sec total_norm: 124.8762 (126.4941) loss: 158.2184 (160.1100) masked_loss: 2.0506 (2.1005) tag_loss: 156.2362 (158.0095) time: 1.4351 (1.6423) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4298 (1.6371) lr: 0.000093 max mem: 26307 2022-03-16 06:51:26,141.141 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7777777910232544 2022-03-16 06:51:26,141.141 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 146.08828735351562 2022-03-16 06:51:26,141.141 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.12265615260347 2022-03-16 06:51:30,159.159 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015313643962144852 2022-03-16 06:51:30,160.160 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 06:51:30,160.160 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'wire', 'racks', 'filled', '[MASK]', 'don', '[MASK]', 'and', 'don', '##ut', 'holes', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 06:51:30,175.175 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'rack', 'hole', 'tray', 'oven', 'metal', 'table', 'different', 'reflection', 'box', 'apple', 'shelf', 'chocolate', 'pastry', 'tomato', 'wall', 'candy', 'light', 'top', 'grill', 'large', 'bunch', 'counter', 'cookie', 'display', 'pan', 'ball', 'food', 'other', 'sign', 'paper', 'variety', 'plastic', 'many', 'stem', 'label', 'bottom', 'bread', 'machine', 'open', 'baking', 'various', 'bar', 'cake', 'container', 'wire', 'orange', 'hand', 'hot', 'glazed'] 2022-03-16 06:51:46,143.143 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'orange', 'hole', 'apple', 'wire', 'shelf', 'tray', 'rack'] 2022-03-16 06:54:09,913.913 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:55:59 iter: 4700 speed: 311.9 images/sec total_norm: 126.5066 (130.3545) loss: 163.5384 (163.6275) masked_loss: 2.0340 (2.0479) tag_loss: 162.0044 (161.5796) time: 1.4340 (1.6414) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4289 (1.6363) lr: 0.000093 max mem: 26307 2022-03-16 06:54:10,276.276 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5454545617103577 2022-03-16 06:54:10,276.276 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 161.82684326171875 2022-03-16 06:54:10,276.276 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.18111085891724 2022-03-16 06:54:14,288.288 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015371406450867653 2022-03-16 06:54:14,288.288 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 06:54:14,288.288 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'boy', 'with', 'his', 'baseball', 'mit', '##t', 'and', 'ball', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 06:54:14,303.303 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'nose', 'face', 'boy', 'eye', 'lip', 'hair', 'head', 'mouth', 'person', 'hand', 'glove', 'girl', '[UNK]', 'ball', 'collar', 'baseball', 'ear', 'finger', 'woman', 'glasses', 'strap', 'man', 'eyebrow', 'arm', 'hat', 'tree', 'logo', 'cap', 'building', 'sunglasses', 'neck', 'button', 'letter', 'sleeve', 'wall', 'chin', 'bat', 'window', 'child', 'handle', 'young', 'jacket', 'stripe', 'thumb', 'sky', 'zipper', 'pole', 'background', 'fence'] 2022-03-16 06:54:30,276.276 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'name', 'face', 'building', 'hair', 'mouth', 'person', 'boy', 'writing', 'eye', 'window', 'baseball', 'ball', 'letter', 'shirt', 'nose', 'ear', 'hole', 'lip', 'hat', 'cap', 'eyebrow', 'glove', 'strap'] 2022-03-16 06:56:54,054.054 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:53:31 iter: 4800 speed: 311.9 images/sec total_norm: 126.9829 (127.8083) loss: 163.8952 (164.3496) masked_loss: 2.1667 (2.1435) tag_loss: 161.8268 (162.2061) time: 1.4342 (1.6414) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4289 (1.6362) lr: 0.000093 max mem: 26307 2022-03-16 06:56:54,415.415 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4545454680919647 2022-03-16 06:56:54,415.415 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 179.7439422607422 2022-03-16 06:56:54,415.415 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
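The `Tag Precision.` figures hovering around 68 are printed without their definition; the score threshold, any top-k cutoff, and the batch aggregation are all unstated. Purely to illustrate the shape of the quantity, a set-overlap precision between predicted and ground-truth tags could look like the following; this is not the pipeline's actual formula:

```python
def tag_precision(predicted, gt):
    """Illustrative only: percentage of predicted tags present in the
    ground-truth tag set."""
    predicted, gt = set(predicted), set(gt)
    return 100.0 * len(predicted & gt) / max(len(predicted), 1)

pred = ["shirt", "nose", "face", "boy", "eye", "ball", "glove"]
truth = ["boy", "shirt", "ball", "glove", "cap", "eye"]
print(f"Tag Precision. = {tag_precision(pred, truth)}")
```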
= 68.14169918760962 2022-03-16 06:56:58,507.507 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015463724732398987 2022-03-16 06:56:58,507.507 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 06:56:58,507.507 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', 'buses', '[MASK]', 'lined', 'up', 'waiting', '[MASK]', 'passengers', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 06:56:58,522.522 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['bus', 'sky', 'window', 'light', 'pole', 'windshield', '[UNK]', 'street', 'plate', 'sign', 'road', 'license', 'number', 'tire', 'building', 'mirror', 'door', 'letter', 'wheel', 'word', 'person', 'car', 'front', 'man', 'tree', 'driver', 'line', 'roof', 'cloud', 'shirt', 'sidewalk', 'bumper', 'curb', 'logo', 'stop', 'top', 'fence', 'decker', 'red', 'woman', 'traffic', 'lot', 'steering', 'double', 'next', 'advertisement', 'reflection', 'grass', 'van', 'hood'] 2022-03-16 06:57:14,567.567 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'number', 'line', 'door', 'road', 'street', 'light', 'window', 'tree', 'letter', 'sky', 'bus', 'plate', 'wheel', 'mirror', 'license', 'pole', 'tire', 'antenna', 'windshield'] 2022-03-16 06:59:38,087.087 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:51:02 iter: 4900 speed: 312.1 images/sec total_norm: 128.6028 (130.9083) loss: 161.6124 (162.2776) masked_loss: 2.0340 (2.0938) tag_loss: 159.4704 (160.1839) time: 1.4327 (1.6403) data: 0.0001 (0.0002) to_device: 0.0049 (0.0048) time_gpu: 1.4276 (1.6353) lr: 0.000093 max mem: 26307 2022-03-16 06:59:38,450.450 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663 2022-03-16 06:59:38,450.450 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 119.79084777832031 2022-03-16 06:59:38,450.450 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.27909439086915 2022-03-16 06:59:42,684.684 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015442762523889542 2022-03-16 06:59:42,685.685 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 06:59:42,685.685 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'pink', 'plastic', 'tray', 'has', 'food', '[MASK]', 'compartments', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 06:59:42,700.700 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['carrot', 'container', '[UNK]', 'food', 'lid', 'table', 'bowl', 'vegetable', 'box', 'plastic', 'potato', 'bean', 'tray', 'bread', 'cheese', 'sauce', 'tomato', 'grape', 'fruit', 'fork', 'meat', 'rice', 'plate', 'sausage', 'egg', 'dish', 'candy', 'mushroom', 'different', 'nut', 'lemon', 'stem', 'cookie', 'cup', 'handle', 'spoon', 'slice', 'corn', 'onion', 'orange', 'top', 'bunch', 'pea', 'pepper', 'close', 'lunch', 'full', 'green', 'dog', 'butter'] 2022-03-16 06:59:58,706.706 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'food', 'pink', 'fruit', 'plastic', 'apple', 'pile', 'candy', 'container', 'tray', 'lid', 'lime', 'lemon', 'potato', 'grape', 'vegetable', 'nut', 'tomato', 'onion', 'carrot'] 2022-03-16 07:02:22,301.301 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:48:34 iter: 5000 speed: 311.8 images/sec total_norm: 126.9490 (130.2349) loss: 160.4424 (162.6607) masked_loss: 2.0719 (2.0567) tag_loss: 158.1428 (160.6040) time: 1.4343 (1.6421) data: 0.0001 (0.0002) to_device: 0.0049 (0.0047) time_gpu: 1.4293 (1.6372) lr: 0.000092 max mem: 26307 2022-03-16 07:02:22,303.303 2829:checkpoint.py:222 save(): Saving checkpoint to output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0005000.pt 2022-03-16 07:03:36,062.062 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.39393940567970276 2022-03-16 07:03:36,062.062 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 156.119140625 2022-03-16 07:03:36,062.062 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.31408407173905 2022-03-16 07:03:40,214.214 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015399402938783169 2022-03-16 07:03:40,214.214 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 07:03:40,215.215 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'of', 'cake', 'left', 'on', 'a', 'plate', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 07:03:40,231.231 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['cake', 'plate', 'table', 'knife', 'handle', '[UNK]', 'blade', 'fork', 'piece', 'chocolate', 'layer', 'shadow', 'slice', 'spoon', 'dessert', 'napkin', 'board', 'white', 'sauce', 'top', 'paper', 'pie', 'cardboard', 'leaf', 'design', 'person', 'whipped', 'cup', 'food', 'box', 'desert', 'crust', 'next', 'container', 'glass', 'wall', 'object', 'bottle', 'cutting', 'tray', 'cream', 'reflection', 'light', 'candle', 'screw', 'floor', 'cloth', 'stem', 'different', 'hole'] 2022-03-16 07:03:56,304.304 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'table', 'piece', 'hole', 'handle', 'plate', 'knife', 'blade', 'cake', 'pizza', 'napkin', 'shovel'] 2022-03-16 07:06:19,340.340 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:00:42 iter: 5100 speed: 216.0 images/sec total_norm: 128.8725 (130.2941) loss: 159.0382 (159.6953) masked_loss: 2.0793 (2.0698) tag_loss: 156.9288 (157.6255) time: 1.4340 (2.3704) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4290 (1.6315) save_time: 73.3883 (73.3883) lr: 0.000092 max mem: 26307 2022-03-16 07:06:19,701.701 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5714285969734192 2022-03-16 07:06:19,702.702 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 160.65884399414062 2022-03-16 07:06:19,702.702 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
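At iter 5000 the trainer writes `snapshot/model_iter_0005000.pt`, and the next logging window books the cost: `save_time: 73.3883` lifts that window's mean iteration time from ~1.64 s to 2.3704 s, which is why the reported speed dips to 216.0 images/sec (512 / 2.3704) for that one interval. A sketch of periodic checkpointing that returns the elapsed time for logging; the 5000-iteration boundary is from the log, treating it as the save period is an inference, and the names here are illustrative:

```python
import time
import torch
from torch import nn, optim

model = nn.Linear(4, 4)
opt = optim.SGD(model.parameters(), lr=1e-4)

def save_checkpoint(it, prefix="model_iter"):
    # Mirrors the snapshot basename in the log (model_iter_0005000.pt) and
    # returns the elapsed seconds that the trainer reports as save_time.
    start = time.time()
    torch.save({"iteration": it,
                "model": model.state_dict(),
                "optimizer": opt.state_dict()},
               f"{prefix}_{it:07d}.pt")
    return time.time() - start

it = 5000
if it % 5000 == 0:                 # save period assumed from the single save seen
    print("save_time:", save_checkpoint(it))
```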
= 68.29552195622371 2022-03-16 07:06:23,907.907 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015500053763389587 2022-03-16 07:06:23,907.907 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 07:06:23,908.908 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'clock', 'set', 'on', '[MASK]', 'of', 'a', 'rhino', '##cer', '##os', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 07:06:23,923.923 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['clock', 'number', 'wall', 'hand', 'face', '[UNK]', 'statue', 'eagle', 'design', 'lion', 'large', 'painting', 'picture', 'bird', 'roman', 'head', 'leaf', 'wing', 'sun', 'hour', 'sword', 'frame', 'building', 'base', 'leg', 'door', 'name', 'gold', 'side', 'handle', 'top', 'old', 'decoration', 'front', 'tree', 'window', 'table', 'big', 'foot', 'background', 'column', 'minute', 'crown', 'emblem', 'word', 'white', 'ornate', 'wood', 'man', 'floor'] 2022-03-16 07:06:39,922.922 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'hand', 'number', 'face', 'top', 'wall', 'base', 'eye', 'foot', 'window', 'leg', 'nose', 'ear', 'sword', 'clock', 'statue', 'bull', 'umbrella'] 2022-03-16 07:09:03,498.498 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:57:54 iter: 5200 speed: 311.9 images/sec total_norm: 125.0973 (126.6906) loss: 168.7177 (167.8608) masked_loss: 2.0960 (2.1171) tag_loss: 166.5221 (165.7437) time: 1.4337 (1.6416) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4285 (1.6364) save_time: 73.3883 (73.3883) lr: 0.000092 max mem: 26307 2022-03-16 07:09:03,860.860 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6000000238418579 2022-03-16 07:09:03,860.860 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 147.057861328125 2022-03-16 07:09:03,860.860 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.30077779517984 2022-03-16 07:09:08,081.081 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015462061390280724 2022-03-16 07:09:08,081.081 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 07:09:08,081.081 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'man', '[MASK]', 'wine', 'into', 'two', 'other', 'men', '##s', 'wine', 'glasses', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 07:09:08,096.096 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wine', 'glass', 'man', 'hand', 'bottle', 'hair', 'head', 'jacket', 'table', 'shirt', 'button', 'glasses', 'paper', 'woman', 'jean', 'face', '[UNK]', 'ear', 'suit', 'wall', 'arm', 'coat', 'napkin', 'bucket', 'person', 'nose', 'ceiling', 'bowl', 'light', 'label', 'bar', 'watch', 'pot', 'cup', 'window', 'sweater', 'door', 'hat', 'menu', 'sign', 'cap', 'container', 'shelf', 'picture', 'eye', 'pitcher', 'group', 'mouth', 'chair', 'sleeve'] 2022-03-16 07:09:24,027.027 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'other', 'head', 'man', 'name', 'hand', 'face', 'woman', 'cup', 'hair', 'person', 'arm', 'glass', 'ring', 'sign', 'jean', 'newspaper', 'shirt', 'wine', 'bag', 'ear', 'bowl', 'suit', 'tie', 'bottle', 'tag', 'button', 'jacket', 'pen', 'glasses', 'logo', 'barrel', 'purse', 'collar', 'sleeve', 'container', 'bucket'] 2022-03-16 07:11:47,667.667 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:55:06 iter: 5300 speed: 311.9 images/sec total_norm: 127.1204 (130.1645) loss: 165.2025 (164.0027) masked_loss: 1.9677 (2.0347) tag_loss: 163.4390 (161.9680) time: 1.4347 (1.6417) data: 0.0001 (0.0002) to_device: 0.0049 (0.0048) time_gpu: 1.4297 (1.6367) save_time: 73.3883 (73.3883) lr: 0.000092 max mem: 26307 2022-03-16 07:11:48,032.032 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6060606241226196 2022-03-16 07:11:48,032.032 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 137.82122802734375 2022-03-16 07:11:48,032.032 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.40083567301433 2022-03-16 07:11:52,325.325 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015472730621695518 2022-03-16 07:11:52,326.326 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 07:11:52,326.326 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'young', 'boy', 'abby', 'a', '[MASK]', 'on', 'smiling', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 07:11:52,342.342 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'boy', 'eye', 'ear', 'hair', 'tie', 'nose', 'wall', 'head', 'face', 'neck', 'collar', 'lip', 'mouth', 'eyebrow', 'knot', 'button', 'teeth', 'door', 'young', 'picture', 'background', '[UNK]', 'shadow', 'smile', 'object', 'shoulder', 'child', 'black', 'stripe', 'front', 'chair', 'room', 'man', 'pocket', 'light', 'blue', 'forehead', 'window', 'chin', 'cheek', 'person', 'little', 'handle', 'shelf', 'frame', 'dress', 'arm', 'table', 'logo'] 2022-03-16 07:12:08,487.487 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'face', 'door', 'young', 'hair', 'mouth', 'wall', 'smile', 'boy', 'eye', 'neck', 'shirt', 'nose', 'ear', 'object', 'lip', 'tie', 'collar', 'eyebrow', 'knot'] 03-16 07:12:23.737 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 07:12:23.737 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 07:12:24.790 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-16 07:14:31,863.863 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:52:19 iter: 5400 speed: 311.8 images/sec total_norm: 127.9301 (130.1617) loss: 165.5096 (165.5691) masked_loss: 2.0282 (2.0227) tag_loss: 163.1503 (163.5464) time: 1.4335 (1.6420) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4284 (1.6368) save_time: 73.3883 (73.3883) lr: 0.000092 max mem: 26307 2022-03-16 07:14:32,225.225 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663 2022-03-16 07:14:32,225.225 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 136.86669921875 2022-03-16 07:14:32,226.226 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.37238422740589 2022-03-16 07:14:36,550.550 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0154603635892272 2022-03-16 07:14:36,550.550 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 07:14:36,551.551 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'an', 'elephant', '[MASK]', 'a', 'shovel', 'with', 'its', 'trunk', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 07:14:36,566.566 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['elephant', 'ground', 'shadow', 'person', 'leg', 'trunk', 'tree', 'ear', 'foot', '[UNK]', 'head', 'man', 'grass', 'shirt', 'fence', 'sky', 'eye', 'pole', 'hat', 'chain', 'sign', 'jacket', 'dirt', 'hose', 'tail', 'stick', 'flag', 'rope', 'hair', 'truck', 'bench', 'hand', 'tire', 'woman', 'crowd', 'cane', 'roof', 'umbrella', 'shoe', 'chair', 'bell', 'seat', 'vehicle', 'jean', 'post', 'toe', 'stand', 'large', 'wheel', 'trailer'] 2022-03-16 07:14:52,627.627 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'building', 'ground', 'post', 'person', 'eye', 'foot', 'tree', 'box', 'sign', 'sky', 'block', 'shirt', 'leg', 'roof', 'ear', 'shadow', 'flag', 'grass', 'trunk', 'fence', 'banner', 'elephant', 'saddle', 'paddle', 'shovel'] 2022-03-16 07:17:16,361.361 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:49:35 iter: 5500 speed: 311.3 images/sec total_norm: 126.0490 (127.5789) loss: 164.1419 (164.4758) masked_loss: 2.1087 (2.0897) tag_loss: 162.0293 (162.3861) time: 1.4336 (1.6450) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4284 (1.6399) save_time: 73.3883 (73.3883) lr: 0.000092 max mem: 26307 2022-03-16 07:17:16,724.724 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.3055555522441864 2022-03-16 07:17:16,724.724 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 156.3013916015625 2022-03-16 07:17:16,725.725 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.40960775102887 2022-03-16 07:17:21,090.090 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015497853979468346 2022-03-16 07:17:21,090.090 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 07:17:21,091.091 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'an', 'orange', 'flower', 'resting', 'in', '[MASK]', 'oddly', 'shaped', 'vase', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 07:17:21,106.106 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['flower', 'vase', 'stem', 'leaf', '[UNK]', 'table', 'background', 'glass', 'wall', 'handle', 'water', 'rose', 'plant', 'reflection', 'white', 'shadow', 'scissors', 'paper', 'clear', 'blade', 'bottom', 'base', 'red', 'top', 'item', 'light', 'rim', 'orange', 'line', 'small', 'design', 'pink', 'next', 'couple', 'shelf', 'ground', 'yellow', 'mirror', 'purple', 'blue', 'bouquet', 'green', 'hole', 'pair', 'colorful', 'bud', 'branch', 'frame', 'object', 'sky'] 2022-03-16 07:17:37,066.066 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'table', 'rose', 'paper', 'background', 'orange', 'shaped', 'blade', 'flower', 'leaf', 'stem', 'vase'] 2022-03-16 07:20:00,665.665 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:46:49 iter: 5600 speed: 311.6 images/sec total_norm: 125.3712 (127.0534) loss: 161.7146 (163.6505) masked_loss: 2.0062 (2.0626) tag_loss: 160.0377 (161.5879) time: 1.4329 (1.6430) data: 0.0001 (0.0005) to_device: 0.0049 (0.0047) time_gpu: 1.4279 (1.6378) save_time: 73.3883 (73.3883) lr: 0.000092 max mem: 26307 2022-03-16 07:20:01,026.026 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6060606241226196 2022-03-16 07:20:01,026.026 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 161.0405731201172 2022-03-16 07:20:01,027.027 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.44808437949733 2022-03-16 07:20:05,419.419 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015514500439167023 2022-03-16 07:20:05,419.419 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 07:20:05,420.420 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'woman', 'sitting', 'in', 'front', 'of', 'a', '[MASK]', 'sitting', 'on', 'top', 'of', 'a', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 07:20:05,435.435 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hand', 'ring', 'woman', 'table', 'scarf', 'plate', 'finger', 'nose', 'shirt', 'glass', 'hair', '[UNK]', 'head', 'face', 'mouth', 'eye', 'food', 'arm', 'knife', 'wall', 'fork', 'cup', 'ear', 'person', 'neck', 'chair', 'bowl', 'glasses', 'napkin', 'cake', 'handle', 'dress', 'water', 'bread', 'bracelet', 'window', 'bottle', 'spoon', 'sandwich', 'wrist', 'girl', 'watch', 'meal', 'lid', 'pizza', 'top', 'man', 'paper', 'sleeve', 'wine'] 2022-03-16 07:20:21,395.395 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['hand', 'face', 'top', 'front', 'woman', 'hair', 'girl', 'mouth', 'table', 'eye', 'chair', 'ring', 'finger', 'nose', 'handle', 'plate', 'lip', 'knife', 'fork', 'eyebrow', 'sleeve', 'pizza', 'slice', 'scarf'] 2022-03-16 07:22:44,945.945 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:44:02 iter: 5700 speed: 311.7 images/sec total_norm: 128.5414 (130.6068) loss: 161.8747 (162.9042) masked_loss: 2.0387 (2.0522) tag_loss: 159.6080 (160.8520) time: 1.4328 (1.6428) data: 0.0001 (0.0001) to_device: 0.0050 (0.0050) time_gpu: 1.4277 (1.6376) save_time: 73.3883 (73.3883) lr: 0.000091 max mem: 26307 2022-03-16 07:22:45,309.309 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361 2022-03-16 07:22:45,309.309 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 144.4256591796875 2022-03-16 07:22:45,309.309 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.48733481045427 2022-03-16 07:22:49,720.720 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015526284463703632 2022-03-16 07:22:49,720.720 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 07:22:49,721.721 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'very', 'large', 'black', 'bear', '[MASK]', '[MASK]', 'the', 'woods', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 07:22:49,736.736 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'bear', 'grass', 'forest', 'trunk', 'head', 'ground', 'ear', 'wood', 'branch', 'flower', 'leg', 'rock', 'field', 'plant', 'bush', '[UNK]', 'black', 'face', 'leaf', 'area', 'log', 'snout', 'brown', 'nose', 'large', 'fur', 'dirt', 'back', 'water', 'stick', 'cub', 'green', 'hill', 'stump', 'path', 'paw', 'grassy', 'next', 'trail', 'weed', 'fern', 'body', 'wooded', 'walking', 'big', 'bird', 'pine', 'standing', 'hillside'] 2022-03-16 07:23:05,672.672 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'black', 'large', 'field', 'forest', 'plant', 'tree', 'wood', 'branch', 'ear', 'bear', 'grass', 'flower', 'trunk', 'foraging'] 2022-03-16 07:25:29,257.257 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:41:17 iter: 5800 speed: 311.6 images/sec total_norm: 127.0965 (131.0839) loss: 161.9664 (163.6607) masked_loss: 2.0147 (2.0309) tag_loss: 159.4530 (161.6298) time: 1.4333 (1.6432) data: 0.0002 (0.0002) to_device: 0.0048 (0.0047) time_gpu: 1.4285 (1.6383) save_time: 73.3883 (73.3883) lr: 0.000091 max mem: 26307 2022-03-16 07:25:29,618.618 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091 2022-03-16 07:25:29,618.618 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 135.27671813964844 2022-03-16 07:25:29,618.618 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.5447828648454 2022-03-16 07:25:34,061.061 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015603570267558098 2022-03-16 07:25:34,062.062 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 07:25:34,062.062 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'strawberry', 'milk', 'shake', 'and', 'two', 'straw', '[MASK]', 'on', 'a', 'plate', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 07:25:34,077.077 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['table', 'glass', 'strawberry', 'leaf', 'plate', 'stem', 'drink', 'foam', 'base', 'fruit', 'plant', 'top', '[UNK]', 'tray', 'cup', 'shadow', 'flower', 'milk', 'ice', 'container', 'vase', 'wall', 'reflection', 'juice', 'berry', 'napkin', 'white', 'rim', 'window', 'dessert', 'food', 'water', 'liquid', 'coaster', 'banana', 'straw', 'red', 'background', 'light', 'bowl', 'spoon', 'coffee', 'chair', 'next', 'cream', 'handle', 'beverage', 'bottle', 'person', 'jar'] 2022-03-16 07:25:49,972.972 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['top', 'table', 'base', 'glass', 'plant', 'background', 'drink', 'plate', 'shadow', 'shake', 'milk', 'leaf', 'stem', 'rim', 'tray', 'strawberry', 'foam'] 2022-03-16 07:28:13,684.684 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:38:32 iter: 5900 speed: 311.4 images/sec total_norm: 129.1343 (130.4509) loss: 160.3487 (162.0344) masked_loss: 1.9951 (2.0217) tag_loss: 159.0124 (160.0128) time: 1.4331 (1.6442) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4278 (1.6391) save_time: 73.3883 (73.3883) lr: 0.000091 max mem: 26307 2022-03-16 07:28:14,044.044 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.375 2022-03-16 07:28:14,045.045 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 159.78659057617188 2022-03-16 07:28:14,045.045 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.49379030863444 2022-03-16 07:28:18,518.518 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015681199729442596 2022-03-16 07:28:18,518.518 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 07:28:18,519.519 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'professional', 'snow', 'board', 'athlete', '[MASK]', 'flight', 'on', 'their', 'board', 'rangers', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 07:28:18,534.534 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'sky', 'jacket', 'man', 'helmet', 'glove', 'person', 'arm', 'board', 'building', 'air', 'vest', 'roof', 'hand', 'head', 'letter', 'trick', 'boot', 'coat', 'design', 'logo', 'snow', 'shoe', 'foot', 'face', 'leg', 'hood', 'wall', 'jump', 'sleeve', 'number', 'hat', 'flag', 'top', 'structure', 'yellow', 'ramp', 'stripe', 'shirt', 'fence', 'tree', 'boy', 'sign', 'word', 'writing', 'jean', 'window', 'pole', 'ground', 'wire'] 2022-03-16 07:28:34,596.596 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'face', 'building', 'board', 'professional', 'person', 'arm', 'foot', 'window', 'flight', 'letter', 'sky', 'roof', 'snow', 'coat', 'jacket', 'logo', 'athlete', 'boot', 'helmet', 'shoe', 'glove', 'vest'] 2022-03-16 07:30:58,174.174 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:35:48 iter: 6000 speed: 311.3 images/sec total_norm: 126.8395 (126.0948) loss: 157.8949 (159.6793) masked_loss: 1.9517 (2.0002) tag_loss: 155.9323 (157.6790) time: 1.4335 (1.6449) data: 0.0001 (0.0002) to_device: 0.0049 (0.0048) time_gpu: 1.4285 (1.6400) save_time: 73.3883 (73.3883) lr: 0.000091 max mem: 26307 2022-03-16 07:30:58,536.536 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5555555820465088 2022-03-16 07:30:58,536.536 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 153.12246704101562 2022-03-16 07:30:58,537.537 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.49794131419698 2022-03-16 07:31:03,071.071 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015682023018598557 2022-03-16 07:31:03,071.071 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 07:31:03,072.072 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'an', '[MASK]', '[MASK]', 'is', 'hanging', 'out', 'by', 'some', 'rocks', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 07:31:03,087.087 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['rock', 'bear', 'head', 'ear', 'nose', 'fur', 'snout', 'plant', 'ground', 'leg', 'face', 'eye', 'wall', 'weed', 'paw', 'mouth', 'leaf', 'boulder', 'grass', 'shadow', 'back', 'black', 'moss', 'water', 'animal', 'stone', 'tree', 'foot', '[UNK]', 'large', 'brown', 'claw', 'zoo', 'tongue', 'log', 'branch', 'arm', 'bush', 'tail', 'rocky', 'snow', 'dirt', 'trunk', 'polar', 'neck', 'next', 'hair', 'flower', 'small', 'enclosure'] 2022-03-16 07:31:19,071.071 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['back', 'head', 'face', 'light', 'ground', 'rock', 'mouth', 'wall', 'eye', 'plant', 'animal', 'tongue', 'ear', 'bear', 'grass', 'tail', 'fur', 'leaf', 'weed'] 2022-03-16 07:33:42,598.598 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:33:03 iter: 6100 speed: 311.4 images/sec total_norm: 124.5278 (127.5351) loss: 159.5699 (161.6323) masked_loss: 1.9772 (1.9913) tag_loss: 157.5625 (159.6409) time: 1.4335 (1.6443) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4284 (1.6392) save_time: 73.3883 (73.3883) lr: 0.000091 max mem: 26307 2022-03-16 07:33:42,958.958 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.65625 2022-03-16 07:33:42,959.959 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 179.55850219726562 2022-03-16 07:33:42,959.959 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.45158533896169 2022-03-16 07:33:47,523.523 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01565631479024887 2022-03-16 07:33:47,523.523 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 07:33:47,524.524 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'is', 'swinging', 'a', 'bat', 'in', 'a', 'grassy', 'field', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 07:33:47,540.540 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['grass', '[UNK]', 'shirt', 'shoe', 'hand', 'leg', 'man', 'head', 'field', 'hair', 'arm', 'person', 'ground', 'boy', 'face', 'tree', 'fence', 'logo', 'park', 'cap', 'sock', 'ear', 'hat', 'pole', 'jean', 'dirt', 'shadow', 'baseball', 'glove', 'woman', 'belt', 'background', 'nose', 'ball', 'young', 'bat', 'glasses', 'short', 'stripe', 'car', 'building', 'watch', 'mouth', 'sleeve', 'jersey', 'girl', 'sunglasses', 'design', 'line', 'bracelet'] 2022-03-16 07:34:03,579.579 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'building', 'field', 'arm', 'window', 'tree', 'shirt', 'leg', 'roof', 'grass', 'hat', 'bat', 'shoe', 'grassy'] 2022-03-16 07:36:27,017.017 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:30:19 iter: 6200 speed: 311.4 images/sec total_norm: 127.5968 (129.2917) loss: 162.7476 (165.0517) masked_loss: 1.9685 (2.0431) tag_loss: 160.4079 (163.0087) time: 1.4332 (1.6442) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4278 (1.6391) save_time: 73.3883 (73.3883) lr: 0.000091 max mem: 26307 2022-03-16 07:36:27,377.377 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6388888955116272 2022-03-16 07:36:27,378.378 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 139.37649536132812 2022-03-16 07:36:27,378.378 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.51522778707837 2022-03-16 07:36:31,995.995 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015720687806606293 2022-03-16 07:36:31,995.995 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 07:36:31,996.996 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'bathroom', 'with', '[MASK]', 'shower', ',', 'sink', 'and', 'mirror', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 07:36:32,011.011 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['door', 'wall', 'bathroom', 'floor', '[UNK]', 'knob', 'sink', 'tile', 'shower', 'handle', 'mirror', 'light', 'towel', 'ceiling', 'toilet', 'window', 'cabinet', 'head', 'tub', 'rack', 'reflection', 'picture', 'rug', 'drain', 'soap', 'holder', 'switch', 'outlet', 'curtain', 'bottle', 'lid', 'paper', 'doorway', 'seat', 'drawer', 'shelf', 'lamp', 'room', 'frame', 'can', 'glass', 'white', 'hair', 'tank', 'dish', 'vanity', 'cup', 'vent', 'rod', 'box'] 2022-03-16 07:36:47,975.975 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'door', 'light', 'floor', 'wall', 'ring', 'cabinet', 'mirror', 'bathroom', 'ceiling', 'shower', 'pole', 'hallway', 'switch', 'sink', 'rod', 'holder', 'reflection', 'outlet', 'tile', 'tub', 'knob'] 2022-03-16 07:39:11,461.461 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:27:34 iter: 6300 speed: 311.4 images/sec total_norm: 124.5659 (128.2067) loss: 163.1437 (162.3157) masked_loss: 2.0132 (2.0708) tag_loss: 160.6618 (160.2449) time: 1.4330 (1.6445) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4280 (1.6394) save_time: 73.3883 (73.3883) lr: 0.000091 max mem: 26307 2022-03-16 07:39:11,821.821 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5 2022-03-16 07:39:11,822.822 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 145.84521484375 2022-03-16 07:39:11,822.822 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.56801855564117 2022-03-16 07:39:16,474.474 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0158122256398201 2022-03-16 07:39:16,474.474 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 07:39:16,475.475 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'ceiling', 'fan', 'is', 'turned', '[MASK]', 'in', 'the', 'kitchen', 'of', 'a', 'house', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 07:39:16,490.490 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['kitchen', '[UNK]', 'cabinet', 'ceiling', 'wall', 'handle', 'window', 'light', 'refrigerator', 'door', 'stove', 'drawer', 'oven', 'sink', 'floor', 'fan', 'coffee', 'pot', 'bowl', 'towel', 'kettle', 'bottle', 'microwave', 'cup', 'outlet', 'top', 'tile', 'knob', 'maker', 'pitcher', 'container', 'tea', 'mixer', 'lid', 'paper', 'rack', 'mug', 'basket', 'magnet', 'picture', 'counter', 'knife', 'curtain', 'shelf', 'flower', 'chair', 'hood', 'vase', 'jar', 'can'] 2022-03-16 07:39:32,547.547 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'house', 'top', 'door', 'light', 'cup', 'floor', 'wall', 'window', 'kitchen', 'picture', 'coffee', 'bowl', 'handle', 'cabinet', 'fan', 'bottle', 'ceiling', 'sink', 'pot', 'maker', 'towel', 'drawer', 'tile', 'banana', 'stove', 'knob', 'oven', 'refrigerator', 'mixer'] 2022-03-16 07:41:56,170.170 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:24:52 iter: 6400 speed: 310.9 images/sec total_norm: 125.1325 (127.5553) loss: 159.4396 (160.4304) masked_loss: 1.9751 (2.0432) tag_loss: 157.7386 (158.3872) time: 1.4347 (1.6471) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4297 (1.6420) save_time: 73.3883 (73.3883) lr: 0.000090 max mem: 26307 2022-03-16 07:41:56,530.530 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.529411792755127 2022-03-16 07:41:56,530.530 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 140.12315368652344 2022-03-16 07:41:56,530.530 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.60622863769531 2022-03-16 07:42:01,216.216 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01582055166363716 2022-03-16 07:42:01,216.216 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 07:42:01,217.217 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'man', '[MASK]', 'in', '[MASK]', 'with', 'a', 'a', 'lot', 'of', 'luggage', 'dirt', 'road', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 07:42:01,232.232 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['man', 'sky', 'bag', 'shirt', 'sunglasses', 'short', 'shoe', 'backpack', 'truck', 'leg', 'umbrella', 'shadow', 'head', 'hand', 'watch', 'ground', 'luggage', 'pole', '[UNK]', 'face', 'bench', 'tree', 'hair', 'person', 'jacket', 'road', 'mountain', 'hat', 'arm', 'water', 'wheel', 'cloud', 'tire', 'wall', 'suitcase', 'boat', 'sign', 'grass', 'stripe', 'vehicle', 'light', 'post', 'door', 'window', 'roof', 'group', 'pile', 'strap', 'building', 'dirt'] 2022-03-16 07:42:17,243.243 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'road', 'short', 'hair', 'lot', 'watch', 'sky', 'shirt', 'leg', 'vehicle', 'bag', 'truck', 'wheel', 'sand', 'pole', 'jacket', 'dirt', 'pile', 'shoe', 'cart', 'tire', 'umbrella', 'backpack', 'sunglasses', 'luggage', 'vest'] 03-16 07:42:24.889 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 07:42:24.889 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 07:42:26.246 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 97}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-16 07:44:40,880.880 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:22:10 iter: 6500 speed: 310.9 images/sec total_norm: 126.5109 (127.8441) loss: 159.2637 (160.3642) masked_loss: 1.9964 (2.0214) tag_loss: 157.2803 (158.3429) time: 1.4337 (1.6471) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4289 (1.6421) save_time: 73.3883 (73.3883) lr: 0.000090 max mem: 26307 2022-03-16 07:44:41,241.241 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5277777910232544 2022-03-16 07:44:41,241.241 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 171.2715606689453 2022-03-16 07:44:41,242.242 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.60344106500798 2022-03-16 07:44:45,981.981 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015841631218791008 2022-03-16 07:44:45,981.981 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 07:44:45,981.981 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'women', 'cutting', 'a', '[MASK]', 'cake', 'with', 'one', 'lit', 'candle', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 07:44:45,996.996 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['plate', 'wall', 'cake', 'table', 'hair', 'candle', '[UNK]', 'shirt', 'hand', 'head', 'woman', 'man', 'bowl', 'face', 'tie', 'glasses', 'person', 'food', 'light', 'suit', 'napkin', 'jacket', 'ear', 'flower', 'floor', 'stack', 'cup', 'tray', 'knife', 'dress', 'window', 'chair', 'room', 'glass', 'ceiling', 'box', 'curtain', 'pizza', 'display', 'arm', 'dessert', 'nose', 'couple', 'group', 'paper', 'cookie', 'cloth', 'spoon', 'dish', 'coat'] 2022-03-16 07:45:01,998.998 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'hand', 'face', 'large', 'woman', 'hair', 'table', 'wall', 'food', 'chair', 'shirt', 'bowl', 'plate', 'coat', 'knife', 'lit', 'jacket', 'glasses', 'cloth', 'collar', 'cake', 'candle', 'napkin'] 2022-03-16 07:47:25,518.518 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:19:28 iter: 6600 speed: 311.0 images/sec total_norm: 128.4024 (130.6842) loss: 162.3504 (161.6405) masked_loss: 1.9337 (2.0026) tag_loss: 160.5454 (159.6380) time: 1.4338 (1.6464) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4285 (1.6413) save_time: 73.3883 (73.3883) lr: 0.000090 max mem: 26307 2022-03-16 07:47:25,880.880 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5833333134651184 2022-03-16 07:47:25,881.881 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 223.20375061035156 2022-03-16 07:47:25,881.881 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.54124120456069 2022-03-16 07:47:30,661.661 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01588067039847374 2022-03-16 07:47:30,662.662 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 07:47:30,662.662 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'in', '[MASK]', 'red', 'sweatshirt', 'sitting', 'on', 'the', 'floor', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 07:47:30,678.678 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hair', 'floor', 'hand', 'shirt', 'girl', 'tile', 'ponytail', '[UNK]', 'woman', 'sweater', 'head', 'jean', 'controller', 'arm', 'chair', 'wall', 'shoe', 'rug', 'ear', 'table', 'jacket', 'game', 'carpet', 'face', 'leg', 'sweatshirt', 'box', 'remote', 'person', 'nose', 'man', 'book', 'ribbon', 'cord', 'young', 'sock', 'room', 'sleeve', 'wii', 'stand', 'bag', 'boy', 'child', 'glasses', 'band', 'paper', 'glass', 'foot', 'bowl', 'can'] 2022-03-16 07:47:46,615.615 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'red', 'woman', 'board', 'hair', 'girl', 'floor', 'table', 'wall', 'arm', 'chair', 'plant', 'watch', 'jean', 'shirt', 'handle', 'cabinet', 'leaf', 'wrist', 'towel', 'ribbon', 'curtain', 'cord', 'tile', 'sweater', 'magnet', 'oven', 'refrigerator', 'ponytail', 'sweatshirt'] 2022-03-16 07:50:10,424.424 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:16:47 iter: 6700 speed: 310.5 images/sec total_norm: 126.4411 (129.7306) loss: 160.9938 (164.2005) masked_loss: 1.9044 (1.9176) tag_loss: 159.0402 (162.2829) time: 1.4345 (1.6490) data: 0.0002 (0.0005) to_device: 0.0048 (0.0046) time_gpu: 1.4294 (1.6439) save_time: 73.3883 (73.3883) lr: 0.000090 max mem: 26307 2022-03-16 07:50:10,785.785 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4848484992980957 2022-03-16 07:50:10,785.785 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 177.44815063476562 2022-03-16 07:50:10,785.785 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.57696050756118 2022-03-16 07:50:15,556.556 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01590898633003235 2022-03-16 07:50:15,556.556 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 07:50:15,556.556 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'living', 'room', 'filled', '[MASK]', 'furniture', 'and', 'a', '[MASK]', 'flat', 'screen', 'tv', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 07:50:15,572.572 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'room', 'television', 'floor', 'curtain', 'book', 'window', 'picture', 'living', 'ceiling', 'shelf', 'table', '[UNK]', 'couch', 'door', 'stand', 'chair', 'pillow', 'rug', 'coffee', 'screen', 'sofa', 'fireplace', 'toy', 'lamp', 'light', 'bag', 'clock', 'speaker', 'box', 'basket', 'center', 'entertainment', 'blanket', 'doorway', 'frame', 'cabinet', 'paper', 'tv', 'shade', 'remote', 'magazine', 'candle', 'plant', 'cup', 'ottoman', 'cushion', 'dog', 'vase', 'mirror'] 2022-03-16 07:50:31,476.476 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'small', 'room', 'book', 'door', 'center', 'light', 'living', 'television', 'board', 'floor', 'star', 'table', 'wall', 'magazine', 'stand', 'chair', 'paper', 'window', 'box', 'ball', 'sign', 'picture', 'screen', 'entertainment', 'animal', 'coffee', 'painting', 'flat', 'ceiling', 'furniture', 'toy', 'pillow', 'basket', 'curtain', 'shelf', 'laptop', 'fireplace', 'mantle', 'rug'] 2022-03-16 07:52:55,322.322 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:14:06 iter: 6800 speed: 310.5 images/sec total_norm: 125.8012 (128.2865) loss: 161.8088 (163.0958) masked_loss: 1.8929 (1.9011) tag_loss: 159.5959 (161.1946) time: 1.4348 (1.6490) data: 0.0001 (0.0002) to_device: 0.0050 (0.0050) time_gpu: 1.4295 (1.6439) save_time: 73.3883 (73.3883) lr: 0.000090 max mem: 26307 2022-03-16 07:52:55,683.683 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091 2022-03-16 07:52:55,683.683 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 188.1556396484375 2022-03-16 07:52:55,683.683 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.4946517944336 2022-03-16 07:53:00,509.509 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015924029052257538 2022-03-16 07:53:00,510.510 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 07:53:00,510.510 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', '[MASK]', 'a', 'women', 'who', 'are', '[MASK]', 'by', 'the', 'street', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 07:53:00,525.525 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'shirt', 'woman', 'sky', 'sunglasses', 'hair', 'roof', 'grass', 'window', 'road', 'man', 'building', 'head', 'truck', 'person', 'hand', 'watch', 'pole', 'tire', 'top', 'street', 'face', 'tank', 'wire', 'arm', 'bus', 'line', 'house', 'car', 'strap', 'bag', '[UNK]', 'purse', 'stripe', 'windshield', 'sign', 'wheel', 'necklace', 'curb', 'phone', 'shadow', 'couple', 'writing', 'dress', 'lady', 'wrist', 'number', 'front', 'light', 'logo'] 2022-03-16 07:53:16,478.478 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'house', 'hand', 'number', 'face', 'line', 'building', 'road', 'street', 'woman', 'hair', 'mouth', 'person', 'arm', 'window', 'tree', 'watch', 'sky', 'jean', 'shirt', 'bus', 'roof', 'bag', 'truck', 'shadow', 'grass', 'wire', 'trunk', 'tire', 'necklace', 'backpack', 'curb', 'strap', 'sunglasses'] 2022-03-16 07:55:40,274.274 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:11:26 iter: 6900 speed: 310.4 images/sec total_norm: 126.1319 (128.3255) loss: 157.1677 (158.2721) masked_loss: 1.9626 (1.9616) tag_loss: 155.5383 (156.3105) time: 1.4337 (1.6495) data: 0.0001 (0.0002) to_device: 0.0049 (0.0048) time_gpu: 1.4287 (1.6446) save_time: 73.3883 (73.3883) lr: 0.000090 max mem: 26307 2022-03-16 07:55:40,635.635 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.47058823704719543 2022-03-16 07:55:40,635.635 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 150.71939086914062 2022-03-16 07:55:40,635.635 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.5934708731515 2022-03-16 07:55:45,538.538 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01597350835800171 2022-03-16 07:55:45,538.538 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 07:55:45,538.538 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'some', 'people', 'in', 'a', '[MASK]', 'preparing', '[MASK]', 'near', 'an', 'oven', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 07:55:45,554.554 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', '[UNK]', 'hair', 'man', 'woman', 'wall', 'floor', 'sign', 'person', 'glasses', 'ceiling', 'cart', 'oven', 'head', 'kitchen', 'apron', 'hand', 'tray', 'food', 'door', 'tile', 'shelf', 'handle', 'container', 'light', 'face', 'vent', 'watch', 'arm', 'lady', 'rack', 'box', 'machine', 'sunglasses', 'pan', 'wheel', 'ear', 'shoe', 'bag', 'chef', 'pole', 'hat', 'grill', 'jean', 'plate', 'bowl', 'stove', 'bin', 'table', 'window'] 2022-03-16 07:56:01,532.532 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'door', 'woman', 'hair', 'person', 'floor', 'table', 'wall', 'food', 'box', 'sign', 'machine', 'shirt', 'kitchen', 'handle', 'bottle', 'ceiling', 'glasses', 'shoe', 'cart', 'shelf', 'tray', 'rack', 'oven', 'apron'] 2022-03-16 07:58:25,238.238 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:08:46 iter: 7000 speed: 310.4 images/sec total_norm: 126.6466 (131.0433) loss: 161.2748 (160.4148) masked_loss: 1.8556 (1.9263) tag_loss: 158.1733 (158.4885) time: 1.4348 (1.6496) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4298 (1.6446) save_time: 73.3883 (73.3883) lr: 0.000089 max mem: 26307 2022-03-16 07:58:25,600.600 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6571428775787354 2022-03-16 07:58:25,600.600 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 136.92257690429688 2022-03-16 07:58:25,600.600 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.57982828919317 2022-03-16 07:58:30,510.510 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.016118109226226807 2022-03-16 07:58:30,510.510 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 07:58:30,510.510 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'pizza', 'on', 'a', 'table', '[MASK]', 'a', 'bowl', 'of', 'grapes', '[MASK]', 'drinks', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 07:58:30,526.526 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['table', 'plate', 'glass', 'pizza', 'olive', 'fork', '[UNK]', 'candle', 'water', 'napkin', 'food', 'onion', 'bowl', 'cheese', 'cup', 'grape', 'beer', 'knife', 'paper', 'liquid', 'restaurant', 'pea', 'crust', 'ham', 'menu', 'person', 'hand', 'straw', 'pepper', 'drink', 'coaster', 'bottle', 'handle', 'meat', 'receipt', 'leaf', 'salt', 'spoon', 'slice', 'light', 'white', 'wine', 'shirt', 'chair', 'tomato', 'vegetable', 'bread', 'butter', 'dish', 'logo'] 2022-03-16 07:58:46,540.540 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'water', 'cup', 'table', 'glass', 'bowl', 'plate', 'bottle', 'leaf', 'fork', 'olive', 'pizza', 'candle', 'grape', 'beverage', 'napkin', 'onion'] 2022-03-16 08:01:10,165.165 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:06:05 iter: 7100 speed: 310.4 images/sec total_norm: 124.5199 (126.7206) loss: 159.0376 (159.8691) masked_loss: 1.9311 (1.9475) tag_loss: 157.2568 (157.9216) time: 1.4348 (1.6492) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4296 (1.6442) save_time: 73.3883 (73.3883) lr: 0.000089 max mem: 26307 2022-03-16 08:01:10,528.528 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6285714507102966 2022-03-16 08:01:10,528.528 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 148.12545776367188 2022-03-16 08:01:10,528.528 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.59850809309218 2022-03-16 08:01:15,496.496 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01621018908917904 2022-03-16 08:01:15,496.496 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 08:01:15,497.497 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'baseball', '[MASK]', 'about', 'to', 'swing', '[MASK]', 'a', 'baseball', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 08:01:15,512.512 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'man', '[UNK]', 'shirt', 'sock', 'bat', 'sky', 'baseball', 'belt', 'field', 'shoe', 'hat', 'player', 'leg', 'person', 'grass', 'ball', 'glove', 'head', 'batter', 'hand', 'umpire', 'uniform', 'ground', 'cap', 'park', 'fence', 'arm', 'plate', 'cloud', 'photo', 'shadow', 'helmet', 'black', 'white', 'net', 'building', 'catcher', 'base', 'foot', 'game', 'jersey', 'photograph', 'home', 'couple', 'jacket', 'bench', 'mask', 'ready', 'boot'] 2022-03-16 08:01:31,536.536 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'player', 'field', 'ground', 'person', 'tree', 'baseball', 'ball', 'sky', 'shirt', 'leg', 'grass', 'belt', 'hat', 'cap', 'bat', 'shoe', 'umpire', 'batter', 'sock'] 2022-03-16 08:03:55,159.159 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:03:25 iter: 7200 speed: 310.3 images/sec total_norm: 127.4923 (132.1943) loss: 163.9954 (162.7163) masked_loss: 1.9071 (1.9623) tag_loss: 161.7013 (160.7541) time: 1.4345 (1.6500) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4295 (1.6450) save_time: 73.3883 (73.3883) lr: 0.000089 max mem: 26307 2022-03-16 08:03:55,521.521 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.53125 2022-03-16 08:03:55,522.522 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 142.30935668945312 2022-03-16 08:03:55,522.522 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.64038702559797 2022-03-16 08:04:00,523.523 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.016358964145183563 2022-03-16 08:04:00,523.523 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 08:04:00,523.523 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'an', '[MASK]', 'flying', '[MASK]', 'the', 'city', 'of', 'paris', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 08:04:00,539.539 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'building', 'tail', 'airplane', 'tree', 'city', 'wing', 'window', 'cloud', 'engine', 'logo', 'cockpit', 'nose', 'water', 'wheel', 'door', 'airport', '[UNK]', 'roof', 'tower', 'plane', 'large', 'stripe', 'car', 'letter', 'fuselage', 'air', 'landing', 'bridge', 'road', 'mountain', 'bush', 'house', 'fence', 'background', 'crane', 'grass', 'light', 'jet', 'sign', 'gear', 'horizon', 'skyscraper', 'person', 'front', 'wall', 'top', 'pole', 'flower', 'runway'] 2022-03-16 08:04:16,518.518 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'city', 'building', 'door', 'base', 'hill', 'mountain', 'engine', 'airport', 'distance', 'window', 'wing', 'tree', 'tower', 'sky', 'roof', 'nose', 'tail', 'cloud', 'logo', 'horizon', 'airplane', 'cockpit', 'spire'] 2022-03-16 08:06:40,227.227 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:00:45 iter: 7300 speed: 310.2 images/sec total_norm: 126.1608 (127.5308) loss: 158.6285 (160.2556) masked_loss: 1.9067 (1.9713) tag_loss: 156.8830 (158.2843) time: 1.4351 (1.6507) data: 0.0001 (0.0002) to_device: 0.0049 (0.0048) time_gpu: 1.4300 (1.6456) save_time: 73.3883 (73.3883) lr: 0.000089 max mem: 26307 2022-03-16 08:06:40,589.589 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.53125 2022-03-16 08:06:40,589.589 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 173.15765380859375 2022-03-16 08:06:40,590.590 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.67391998703415 2022-03-16 08:06:45,646.646 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.016344014555215836 2022-03-16 08:06:45,647.647 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 08:06:45,647.647 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'is', '##landa', 'on', 'the', 'hood', 'of', 'a', 'small', 'car', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 08:06:45,662.662 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['car', 'cat', 'bush', 'windshield', '[UNK]', 'window', 'grass', 'tree', 'mirror', 'hood', 'building', 'plant', 'road', 'reflection', 'ear', 'sky', 'door', 'ground', 'curb', 'pole', 'light', 'head', 'tire', 'tail', 'photo', 'wheel', 'fence', 'roof', 'house', 'white', 'paw', 'sidewalk', 'handle', 'leaf', 'wall', 'license', 'shadow', 'weed', 'trunk', 'black', 'line', 'bumper', 'truck', 'sign', 'logo', 'dog', 'flower', 'leg', 'animal', 'street'] 2022-03-16 08:07:01,748.748 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'small', 'building', 'light', 'car', 'wall', 'plant', 'tree', 'clothes', 'cat', 'plate', 'bush', 'license', 'photo', 'flower', 'leaf', 'hood', 'logo', 'cloth', 'fence', 'sidewalk', 'windshield'] 2022-03-16 08:09:25,349.349 2829:trainer.py:487 do_train_dict(): eta: 1 day, 2:58:05 iter: 7400 speed: 310.1 images/sec total_norm: 127.8721 (131.9460) loss: 162.8023 (162.5743) masked_loss: 1.9534 (1.9888) tag_loss: 160.7738 (160.5855) time: 1.4335 (1.6512) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4282 (1.6461) save_time: 73.3883 (73.3883) lr: 0.000089 max mem: 26307 2022-03-16 08:09:25,710.710 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4848484992980957 2022-03-16 08:09:25,710.710 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 126.70350646972656 2022-03-16 08:09:25,710.710 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.70002421061199 2022-03-16 08:09:30,979.979 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.016315054148435593 2022-03-16 08:09:30,979.979 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 08:09:30,980.980 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'women', '[MASK]', 'in', 'a', 'room', 'holding', 'wine', 'glasses', ',', '[MASK]', 'other', 'people', 'behind', 'them', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 08:09:30,995.995 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'glass', 'hair', 'woman', 'man', 'person', 'hand', 'wine', 'wall', 'scarf', 'glasses', 'paper', 'ceiling', 'face', 'light', 'window', 'head', 'room', 'jacket', '[UNK]', 'jean', 'watch', 'group', 'napkin', 'purse', 'book', 'table', 'bag', 'sweater', 'pillar', 'arm', 'picture', 'bottle', 'column', 'floor', 'lady', 'neck', 'ring', 'menu', 'door', 'strap', 'folder', 'necklace', 'staircase', 'ear', 'beard', 'smile', 'short', 'chair', 'nose'] 2022-03-16 08:09:47,003.003 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'other', 'head', 'man', 'hand', 'face', 'room', 'light', 'woman', 'hair', 'girl', 'person', 'floor', 'table', 'wall', 'phone', 'glass', 'paper', 'window', 'cell', 'ring', 'shirt', 'wine', 'bag', 'coat', 'ceiling', 'jacket', 'glasses', 'necklace', 'poster', 'candle', 'scarf', 'folder'] 2022-03-16 08:12:10,561.561 2829:trainer.py:487 do_train_dict(): eta: 1 day, 2:55:26 iter: 7500 speed: 309.9 images/sec total_norm: 124.2984 (126.4991) loss: 162.5713 (162.9374) masked_loss: 1.8916 (1.8848) tag_loss: 160.5367 (161.0526) time: 1.4337 (1.6521) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4285 (1.6470) save_time: 73.3883 (73.3883) lr: 0.000089 max mem: 26307 2022-03-16 08:12:10,922.922 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5 2022-03-16 08:12:10,922.922 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 146.9164581298828 2022-03-16 08:12:10,922.922 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.75977787218596 2022-03-16 08:12:16,090.090 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0163092240691185 2022-03-16 08:12:16,090.090 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 08:12:16,090.090 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'dog', 'that', 'is', 'looking', 'at', 'protected', 'herd', 'of', 'sheep', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 08:12:16,105.105 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sheep', 'head', 'grass', 'hill', 'field', 'ear', 'rock', 'herd', 'face', 'leg', 'tail', 'group', '[UNK]', 'mountain', 'hillside', 'ground', 'sky', 'lamb', 'animal', 'goat', 'wool', 'tree', 'bush', 'eye', 'green', 'nose', 'grassy', 'tag', 'horn', 'fence', 'large', 'other', 'grazing', 'water', 'flock', 'person', 'road', 'bunch', 'open', 'rocky', 'snow', 'white', 'number', 'next', 'pasture', 'dirt', 'gravel', 'landscape', 'background', 'dog'] 03-16 08:12:26.260 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 08:12:26.260 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 08:12:26.925 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 10}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 10}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 9}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 13}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 13}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 12}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 12}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 12}] 2022-03-16 08:12:32,117.117 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'face', 'field', 'ground', 'rock', 'hill', 'dog', 'ear', 'grass', 'tail', 'sheep', 'herd'] 2022-03-16 08:14:55,723.723 2829:trainer.py:487 do_train_dict(): eta: 1 day, 2:52:47 iter: 7600 speed: 310.0 images/sec total_norm: 131.4668 (133.6689) loss: 156.9714 (157.9888) masked_loss: 1.8445 (1.9087) tag_loss: 154.9431 (156.0801) time: 1.4343 (1.6516) data: 0.0001 (0.0002) to_device: 0.0049 (0.0047) time_gpu: 1.4292 (1.6467) save_time: 73.3883 (73.3883) lr: 0.000089 max mem: 26307 2022-03-16 08:14:56,084.084 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.529411792755127 2022-03-16 08:14:56,085.085 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 169.597412109375 2022-03-16 08:14:56,085.085 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.76016156085126 2022-03-16 08:15:01,272.272 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.016325822100043297 2022-03-16 08:15:01,272.272 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 08:15:01,273.273 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'holding', 'a', 'fr', '##is', '##bee', '[MASK]', 'to', 'a', 'boy', 'in', 'front', 'of', 'him', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 08:15:01,288.288 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'man', 'short', 'sky', 'tree', 'grass', 'head', 'hair', 'logo', 'field', 'hat', '[UNK]', 'hand', 'fence', 'face', 'cap', 'sunglasses', 'leg', 'stripe', 'arm', 'ear', 'line', 'glove', 'watch', 'glasses', 'sock', 'park', 'pole', 'shoe', 'wire', 'bush', 'boy', 'cloud', 'jersey', 'post', 'house', 'game', 'person', 'uniform', 'ground', 'cone', 'design', 'young', 'grassy', 'nose', 'knee', 'number', 'band', 'background', 'sleeve'] 2022-03-16 08:15:17,254.254 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'line', 'power', 'front', 'short', 'field', 'hair', 'post', 'person', 'arm', 'boy', 'tree', 'watch', 'sky', 'shirt', 'leg', 'nose', 'ear', 'grass', 'hat', 'cap', 'wrist', 'logo', 'fence', 'glove', 'stripe'] 2022-03-16 08:17:40,798.798 2829:trainer.py:487 do_train_dict(): eta: 1 day, 2:50:07 iter: 7700 speed: 310.2 images/sec total_norm: 126.8387 (131.1606) loss: 160.7193 (161.1674) masked_loss: 2.0412 (1.9750) tag_loss: 158.6781 (159.1924) time: 1.4340 (1.6508) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4289 (1.6457) save_time: 73.3883 (73.3883) lr: 0.000088 max mem: 26307 2022-03-16 08:17:41,159.159 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5625 2022-03-16 08:17:41,160.160 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 146.6061248779297 2022-03-16 08:17:41,160.160 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.83692081157977 2022-03-16 08:17:46,362.362 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.016308940947055817 2022-03-16 08:17:46,362.362 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 08:17:46,362.362 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'young', '[MASK]', 'on', 'a', 'skate', '##board', 'doing', 'tricks', 'on', 'the', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 08:17:46,378.378 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['building', 'grass', 'hand', 'window', 'man', 'arm', 'head', '[UNK]', 'shirt', 'sky', 'tree', 'sidewalk', 'hat', 'hair', 'light', 'person', 'ground', 'road', 'pole', 'boy', 'jean', 'sweater', 'car', 'jacket', 'street', 'house', 'shoe', 'roof', 'sign', 'leg', 'shadow', 'wall', 'park', 'face', 'bush', 'sweatshirt', 'cap', 'door', 'rock', 'city', 'curb', 'line', 'fence', 'coat', 'cloud', 'wheel', 'woman', 'glove', 'sleeve', 'balcony'] 2022-03-16 08:18:02,365.365 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'building', 'park', 'young', 'light', 'ground', 'rock', 'arm', 'boy', 'window', 'tree', 'sky', 'jean', 'leg', 'bell', 'shadow', 'wheel', 'grass', 'hat', 'pole', 'lamp', 'shoe', 'cement', 'sidewalk', 'boulder', 'sweater'] 2022-03-16 08:20:26,093.093 2829:trainer.py:487 do_train_dict(): eta: 1 day, 2:47:28 iter: 7800 speed: 309.8 images/sec total_norm: 126.0564 (129.4153) loss: 162.3647 (163.3301) masked_loss: 1.9442 (1.9351) tag_loss: 160.4022 (161.3950) time: 1.4337 (1.6529) data: 0.0001 (0.0005) to_device: 0.0051 (0.0049) time_gpu: 1.4284 (1.6475) save_time: 73.3883 (73.3883) lr: 0.000088 max mem: 26307 2022-03-16 08:20:26,455.455 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6363636255264282 2022-03-16 08:20:26,456.456 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 142.39051818847656 2022-03-16 08:20:26,456.456 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.87405501739889 2022-03-16 08:20:31,738.738 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.016296187415719032 2022-03-16 08:20:31,739.739 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 08:20:31,739.739 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'couple', 'of', 'cats', 'that', '[MASK]', 'sitting', 'on', '[MASK]', 'fence', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 08:20:31,754.754 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['cat', 'ear', 'tail', 'head', 'bowl', 'grass', 'fence', 'leg', 'paw', 'wood', 'body', 'wall', 'plant', 'eye', 'leaf', 'post', '[UNK]', 'bench', 'tree', 'board', 'wooden', 'ledge', 'fur', 'neck', 'back', 'top', 'table', 'light', 'pot', 'window', 'flower', 'black', 'trunk', 'nose', 'sun', 'bush', 'collar', 'brown', 'water', 'large', 'small', 'branch', 'white', 'animal', 'container', 'weed', 'face', 'field', 'bucket', 'next'] 2022-03-16 08:20:47,827.827 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'post', 'wall', 'couple', 'nose', 'ear', 'bowl', 'cat', 'grass', 'tail', 'flower', 'fence', 'ledge', 'paw'] 2022-03-16 08:23:11,304.304 2829:trainer.py:487 do_train_dict(): eta: 1 day, 2:44:48 iter: 7900 speed: 309.9 images/sec total_norm: 123.3705 (125.8484) loss: 159.1401 (160.0956) masked_loss: 1.9052 (1.9651) tag_loss: 157.3639 (158.1305) time: 1.4332 (1.6522) data: 0.0001 (0.0002) to_device: 0.0049 (0.0048) time_gpu: 1.4280 (1.6472) save_time: 73.3883 (73.3883) lr: 0.000088 max mem: 26307 2022-03-16 08:23:11,665.665 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4848484992980957 2022-03-16 08:23:11,665.665 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 176.79196166992188 2022-03-16 08:23:11,665.665 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.89437084197998 2022-03-16 08:23:16,964.964 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01640871912240982 2022-03-16 08:23:16,964.964 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 08:23:16,965.965 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'woman', '[MASK]', 'an', 'umbrella', 'over', 'another', 'man', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 08:23:16,980.980 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'man', 'hand', 'head', 'building', 'hair', 'face', 'nose', 'arm', 'wall', 'ear', 'shirt', '[UNK]', 'mouth', 'eye', 'jacket', 'phone', 'glasses', 'person', 'window', 'sleeve', 'bush', 'cell', 'collar', 'woman', 'hat', 'beard', 'pole', 'door', 'house', 'table', 'finger', 'watch', 'brick', 'fence', 'bench', 'sweater', 'cap', 'sunglasses', 'sign', 'jean', 'handle', 'logo', 'camera', 'trunk', 'chair', 'boy', 'roof', 'neck', 'grass'] 2022-03-16 08:23:32,994.994 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'hand', 'face', 'woman', 'hair', 'person', 'arm', 'neck', 'foot', 'tree', 'shirt', 'nose', 'suit', 'coat', 'hat', 'button', 'jacket', 'glasses', 'umbrella'] 2022-03-16 08:25:56,533.533 2829:trainer.py:487 do_train_dict(): eta: 1 day, 2:42:08 iter: 8000 speed: 309.9 images/sec total_norm: 128.0208 (127.7262) loss: 162.2748 (163.2440) masked_loss: 1.9377 (1.9987) tag_loss: 160.7706 (161.2453) time: 1.4342 (1.6523) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4290 (1.6471) save_time: 73.3883 (73.3883) lr: 0.000088 max mem: 26307 2022-03-16 08:25:56,894.894 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4545454680919647 2022-03-16 08:25:56,894.894 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 144.53936767578125 2022-03-16 08:25:56,895.895 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.90647916440611 2022-03-16 08:26:02,244.244 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0164373479783535 2022-03-16 08:26:02,244.244 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 08:26:02,245.245 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'team', 'of', 'baseball', 'players', 'playing', 'a', 'game', 'of', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 08:26:02,260.260 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['field', '[UNK]', 'stand', 'player', 'shirt', 'man', 'helmet', 'umpire', 'person', 'catcher', 'uniform', 'bat', 'batter', 'line', 'crowd', 'grass', 'shoe', 'baseball', 'wall', 'stadium', 'fence', 'chair', 'jersey', 'plate', 'dirt', 'hat', 'home', 'sign', 'game', 'head', 'glove', 'pitcher', 'spectator', 'number', 'cap', 'net', 'hedge', 'mound', 'leg', 'stair', 'belt', 'mask', 'ball', 'hand', 'base', 'pitchers', 'group', 'pole', 'back', 'sock'] 2022-03-16 08:26:18,261.261 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'team', 'man', 'home', 'game', 'line', 'player', 'field', 'person', 'wall', 'base', 'stand', 'stadium', 'baseball', 'sign', 'shirt', 'jersey', 'leg', 'crowd', 'plate', 'grass', 'hat', 'uniform', 'dirt', 'bat', 'logo', 'fence', 'helmet', 'shoe', 'catcher', 'glove', 'hedge', 'umpire', 'spectator', 'batter'] 2022-03-16 08:28:41,675.675 2829:trainer.py:487 do_train_dict(): eta: 1 day, 2:39:28 iter: 8100 speed: 310.0 images/sec total_norm: 130.1164 (130.8522) loss: 162.8019 (160.7357) masked_loss: 1.8161 (1.8873) tag_loss: 160.8265 (158.8485) time: 1.4323 (1.6514) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4271 (1.6463) save_time: 73.3883 (73.3883) lr: 0.000088 max mem: 26307 2022-03-16 08:28:42,039.039 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5555555820465088 2022-03-16 08:28:42,040.040 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 153.68116760253906 2022-03-16 08:28:42,040.040 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.92798046949433 2022-03-16 08:28:47,429.429 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01645222119987011 2022-03-16 08:28:47,429.429 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 08:28:47,429.429 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'before', '[MASK]', '##nished', '[MASK]', 'panels', 'is', 'a', 'young', 'man', 'in', 'a', 'gray', 'striped', 'dress', 'shirt', ',', '[MASK]', 'tie', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 08:28:47,445.445 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'man', 'wall', 'hair', 'head', '[UNK]', 'face', 'nose', 'hand', 'ear', 'door', 'eye', 'tie', 'mouth', 'arm', 'collar', 'glasses', 'cabinet', 'belt', 'floor', 'suit', 'picture', 'room', 'chair', 'knob', 'table', 'woman', 'ceiling', 'shelf', 'bottle', 'light', 'window', 'handle', 'beard', 'person', 'watch', 'jean', 'jacket', 'glass', 'phone', 'neck', 'paper', 'mustache', 'button', 'cup', 'box', 'mirror', 'sign', 'sleeve', 'leg'] 2022-03-16 08:29:03,462.462 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'face', 'black', 'door', 'young', 'dark', 'hair', 'mouth', 'wall', 'eye', 'wood', 'shirt', 'gray', 'dress', 'nose', 'ear', 'handle', 'tie', 'belt', 'knob'] 2022-03-16 08:31:26,838.838 2829:trainer.py:487 do_train_dict(): eta: 1 day, 2:36:48 iter: 8200 speed: 310.0 images/sec total_norm: 129.3882 (130.4708) loss: 158.9577 (161.8745) masked_loss: 1.9345 (1.9505) tag_loss: 156.9890 (159.9239) time: 1.4317 (1.6517) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4269 (1.6467) save_time: 73.3883 (73.3883) lr: 0.000088 max mem: 26307 2022-03-16 08:31:27,201.201 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5454545617103577 2022-03-16 08:31:27,202.202 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 149.48638916015625 2022-03-16 08:31:27,202.202 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.98289324289345 2022-03-16 08:31:32,588.588 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.016465699300169945 2022-03-16 08:31:32,589.589 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 08:31:32,589.589 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'man', 'taking', 'a', 'picture', 'of', 'himself', 'with', 'something', 'in', 'his', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 08:31:32,604.604 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'door', 'hair', 'man', 'hand', '[UNK]', 'glasses', 'arm', 'phone', 'face', 'head', 'window', 'wall', 'cell', 'ear', 'pillar', 'room', 'tree', 'ceiling', 'handle', 'camera', 'column', 'design', 'floor', 'knob', 'jean', 'bush', 'picture', 'nose', 'light', 'sleeve', 'short', 'elbow', 'blind', 'leg', 'house', 'logo', 'screen', 'cabinet', 'switch', 'tile', 'chair', 'plant', 'eye', 'outside', 'young', 'post', 'doorway', 'mouth', 'person'] 2022-03-16 08:31:48,637.637 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'something', 'door', 'short', 'hair', 'mouth', 'wall', 'arm', 'phone', 'window', 'image', 'shirt', 'picture', 'ear', 'handle', 'bush', 'glasses', 'elbow', 'closet', 'pillow', 'bicycle', 'towel', 'straw', 'pillar', 'knob'] 2022-03-16 08:34:12,189.189 2829:trainer.py:487 do_train_dict(): eta: 1 day, 2:34:09 iter: 8300 speed: 309.6 images/sec total_norm: 128.5603 (131.6583) loss: 157.4153 (158.8804) masked_loss: 1.8459 (1.8920) tag_loss: 155.8786 (156.9883) time: 1.4335 (1.6535) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4284 (1.6484) save_time: 73.3883 (73.3883) lr: 0.000088 max mem: 26307 2022-03-16 08:34:12,552.552 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.625 2022-03-16 08:34:12,552.552 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 182.23329162597656 2022-03-16 08:34:12,552.552 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.98092814854213 2022-03-16 08:34:18,007.007 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.016450172290205956 2022-03-16 08:34:18,007.007 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 08:34:18,008.008 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'many', 'different', 'clocks', 'and', 'different', 'time', 'zones', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 08:34:18,023.023 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['clock', 'hand', 'number', 'letter', 'wall', 'word', 'face', 'shadow', '[UNK]', 'reflection', 'writing', 'ceiling', 'light', 'floor', 'display', 'handle', 'sign', 'logo', 'base', 'stand', 'man', 'window', 'person', 'white', 'tile', 'table', 'mirror', 'top', 'paper', 'circle', 'box', 'shirt', 'line', 'door', 'head', 'car', 'cord', 'large', 'frame', 'woman', 'arrow', 'arm', 'lettering', 'room', 'platform', 'design', 'hair', 'leg', 'pole', 'name'] 2022-03-16 08:34:34,076.076 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['time', 'many', 'name', 'hand', 'number', 'face', 'different', 'word', 'wall', 'letter', 'clock'] 2022-03-16 08:36:57,548.548 2829:trainer.py:487 do_train_dict(): eta: 1 day, 2:31:29 iter: 8400 speed: 309.6 images/sec total_norm: 129.6379 (134.4788) loss: 157.8736 (160.3304) masked_loss: 1.8816 (1.9116) tag_loss: 155.9777 (158.4188) time: 1.4333 (1.6536) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4282 (1.6485) save_time: 73.3883 (73.3883) lr: 0.000087 max mem: 26307 2022-03-16 08:36:57,912.912 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6470588445663452 2022-03-16 08:36:57,912.912 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 173.57479858398438 2022-03-16 08:36:57,912.912 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.94503371294807 2022-03-16 08:37:03,405.405 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01646980084478855 2022-03-16 08:37:03,405.405 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 08:37:03,405.405 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'group', '[MASK]', 'young', 'men', 'playing', 'a', 'game', 'of', 'soccer', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 08:37:03,421.421 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['man', 'shirt', 'tree', 'short', 'hair', 'sock', 'shoe', 'ball', 'ground', 'grass', 'hand', 'head', '[UNK]', 'tank', 'pole', 'leg', 'vest', 'trunk', 'arm', 'park', 'soccer', 'sidewalk', 'person', 'top', 'stripe', 'leaf', 'boy', 'face', 'jean', 'fence', 'bottle', 'field', 'bag', 'guy', 'background', 'bench', 'jacket', 'car', 'foot', 'rock', 'phone', 'hat', 'dirt', 'couple', 'trash', 'tire', 'ear', 'beard', 'male', 'cup'] 2022-03-16 08:37:19,392.392 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'group', 'hand', 'game', 'top', 'young', 'cup', 'short', 'ground', 'hair', 'person', 'arm', 'tree', 'ball', 'jean', 'shirt', 'leg', 'soccer', 'tank', 'grass', 'pole', 'helmet', 'shoe', 'vest', 'sock'] 2022-03-16 08:39:43,055.055 2829:trainer.py:487 do_train_dict(): eta: 1 day, 2:28:51 iter: 8500 speed: 309.4 images/sec total_norm: 127.0314 (128.9566) loss: 159.1973 (158.7116) masked_loss: 1.8547 (1.8785) tag_loss: 157.5667 (156.8330) time: 1.4340 (1.6551) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4289 (1.6499) save_time: 73.3883 (73.3883) lr: 0.000087 max mem: 26307 2022-03-16 08:39:43,417.417 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.53125 2022-03-16 08:39:43,418.418 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 133.6230010986328 2022-03-16 08:39:43,418.418 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.9906919612441
2022-03-16 08:39:48,967.967 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.016516124829649925
2022-03-16 08:39:48,967.967 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 08:39:48,968.968 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'multiple', 'clouds', 'in', 'a', 'field', 'on', 'a', 'cloudy', 'day', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 08:39:48,983.983 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['grass', 'cow', 'tree', 'head', 'sky', 'shadow', 'leg', 'ear', 'field', 'face', 'trunk', 'nose', 'tail', 'leaf', 'cloud', 'horn', '[UNK]', 'green', 'animal', 'group', 'water', 'cattle', 'herd', 'pasture', 'ground', 'grassy', 'bull', 'spot', 'post', 'rock', 'sheep', 'eye', 'plant', 'background', 'stick', 'fence', 'calf', 'mouth', 'brown', 'flower', 'lush', 'bush', 'tag', 'grazing', 'white', 'horizon', 'hill', 'distance', 'pole', 'dog']
2022-03-16 08:40:04,993.993 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'day', 'face', 'field', 'mouth', 'eye', 'tree', 'sky', 'leg', 'nose', 'ear', 'shadow', 'grass', 'tail', 'cloud', 'horn', 'trunk', 'cow', 'cloudy']
03-16 08:42:27.025 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi
03-16 08:42:27.025 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi
03-16 08:42:28.277 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 95}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 94}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 95}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}]
2022-03-16 08:42:28,526.526 2829:trainer.py:487 do_train_dict(): eta: 1 day, 2:26:12 iter: 8600 speed: 309.4 images/sec total_norm: 126.1418 (129.9278) loss: 156.5067 (159.4967) masked_loss: 1.9182 (1.9152) tag_loss: 154.5516 (157.5815) time: 1.4339 (1.6546) data: 0.0001 (0.0001) to_device: 0.0052 (0.0051) time_gpu: 1.4288 (1.6494) save_time: 73.3883 (73.3883) lr: 0.000087 max mem: 26307
2022-03-16 08:42:28,887.887 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.529411792755127
2022-03-16 08:42:28,888.888 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 137.04002380371094
2022-03-16 08:42:28,888.888 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.07255326194324
2022-03-16 08:42:34,486.486 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.016567859798669815
2022-03-16 08:42:34,487.487 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 08:42:34,487.487 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'black', 'cat', '[MASK]', 'to', '[MASK]', 'photograph', 'of', 'a', 'cat', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 08:42:34,502.502 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['ear', 'cat', 'wall', 'head', 'frame', 'picture', 'eye', 'tail', 'carpet', '[UNK]', 'mirror', 'black', 'floor', 'face', 'nose', 'room', 'table', 'man', 'reflection', 'photo', 'leg', 'shadow', 'window', 'paw', 'door', 'white', 'light', 'body', 'fur', 'photograph', 'painting', 'curtain', 'front', 'neck', 'next', 'woman', 'couple', 'animal', 'dog', 'book', 'ceiling', 'shelf', 'collar', 'rug', 'ground', 'mouth', 'flower', 'handle', 'person', 'top']
2022-03-16 08:42:50,487.487 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'room', 'black', 'floor', 'wall', 'eye', 'picture', 'dog', 'animal', 'ear', 'frame', 'cat', 'shadow', 'tail', 'photograph', 'carpet']
2022-03-16 08:45:14,105.105 2829:trainer.py:487 do_train_dict(): eta: 1 day, 2:23:34 iter: 8700 speed: 309.2 images/sec total_norm: 127.0476 (128.9059) loss: 158.8764 (161.1638) masked_loss: 1.7965 (1.8851) tag_loss: 157.1656 (159.2787) time: 1.4347 (1.6559) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4295 (1.6507) save_time: 73.3883 (73.3883) lr: 0.000087 max mem: 26307
2022-03-16 08:45:14,466.466 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5
2022-03-16 08:45:14,466.466 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 191.82302856445312
2022-03-16 08:45:14,466.466 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.0547200983221
2022-03-16 08:45:20,049.049 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.016604948788881302
2022-03-16 08:45:20,049.049 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 08:45:20,049.049 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', '[MASK]', 'a', '[MASK]', '##d', 'in', 'enclosure', 'eats', 'from', 'the', 'ground', 'beneath', 'a', 'tree', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 08:45:20,065.065 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['ground', 'zebra', 'leg', 'grass', 'tail', 'pole', 'shadow', 'tree', 'head', 'fence', 'stripe', 'trunk', 'dirt', '[UNK]', 'mane', 'ear', 'hay', 'post', 'neck', 'wire', 'rock', 'enclosure', 'nose', 'zoo', 'bush', 'stick', 'pen', 'feeder', 'branch', 'straw', 'log', 'mouth', 'leaf', 'mesh', 'rope', 'area', 'next', 'stump', 'building', 'animal', 'road', 'shade', 'wall', 'hair', 'basket', 'trough', 'other', 'field', 'foot', 'hose']
2022-03-16 08:45:36,052.052 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'ground', 'post', 'neck', 'tree', 'wood', 'branch', 'leg', 'shadow', 'grass', 'tail', 'dirt', 'wire', 'rope', 'trunk', 'fence', 'log', 'enclosure', 'stripe', 'mesh', 'mane', 'netting', 'zebra']
2022-03-16 08:47:59,615.615 2829:trainer.py:487 do_train_dict(): eta: 1 day, 2:20:55 iter: 8800 speed: 309.3 images/sec total_norm: 128.2200 (128.9056) loss: 161.5915 (159.8988) masked_loss: 1.8907 (1.9220) tag_loss: 159.7148 (157.9769) time: 1.4329 (1.6551) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4281 (1.6501) save_time: 73.3883 (73.3883) lr: 0.000087 max mem: 26307
2022-03-16 08:47:59,977.977 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5454545617103577
2022-03-16 08:47:59,977.977 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 166.46978759765625
2022-03-16 08:47:59,977.977 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.0237956529253
2022-03-16 08:48:05,633.633 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01659729890525341
2022-03-16 08:48:05,633.633 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 08:48:05,634.634 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'person', '[MASK]', 'a', 'motorcycle', 'coming', 'up', 'to', 'a', 'stop', 'sign', '[MASK]', 'the', 'street', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 08:48:05,649.649 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'road', 'tire', 'pole', 'tree', 'car', 'street', 'motorcycle', 'light', 'line', 'man', '[UNK]', 'helmet', 'wire', 'building', 'person', 'traffic', 'sidewalk', 'sign', 'curb', 'window', 'bike', 'wall', 'jacket', 'wheel', 'shadow', 'bush', 'lot', 'intersection', 'power', 'shirt', 'roof', 'jean', 'truck', 'house', 'suv', 'windshield', 'van', 'mirror', 'parking', 'fence', 'license', 'plate', 'grass', 'hat', 'tail', 'arrow', 'bus', 'stop', 'median']
2022-03-16 08:48:21,668.668 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'line', 'road', 'power', 'street', 'light', 'car', 'stop', 'person', 'tree', 'sign', 'sky', 'truck', 'shadow', 'grass', 'bush', 'pole', 'jacket', 'motorcycle', 'tire']
2022-03-16 08:50:45,198.198 2829:trainer.py:487 do_train_dict(): eta: 1 day, 2:18:17 iter: 8900 speed: 309.2 images/sec total_norm: 129.5278 (134.4210) loss: 159.9204 (161.6548) masked_loss: 1.8063 (1.8724) tag_loss: 157.4885 (159.7824) time: 1.4322 (1.6559) data: 0.0001 (0.0005) to_device: 0.0049 (0.0047) time_gpu: 1.4270 (1.6506) save_time: 73.3883 (73.3883) lr: 0.000087 max mem: 26307
2022-03-16 08:50:45,558.558 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5428571701049805
2022-03-16 08:50:45,558.558 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 172.56634521484375
2022-03-16 08:50:45,558.558 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.04009874131944
2022-03-16 08:50:51,254.254 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.016634687781333923
2022-03-16 08:50:51,254.254 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 08:50:51,254.254 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'large', 'elephant', 'walking', 'in', 'the', '[MASK]', 'alone', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 08:50:51,270.270 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'ear', 'trunk', 'elephant', 'ground', 'head', 'leg', 'eye', 'pole', 'foot', 'fence', 'tail', 'sand', 'mouth', 'dirt', 'back', '[UNK]', 'wall', 'building', 'enclosure', 'rock', 'zoo', 'person', 'shadow', 'road', 'man', 'background', 'bush', 'water', 'face', 'structure', 'shirt', 'couple', 'sky', 'hair', 'post', 'grass', 'leaf', 'baby', 'roof', 'stick', 'woman', 'forest', 'light', 'toe', 'sign', 'hill', 'short', 'next', 'pen']
2022-03-16 08:51:07,241.241 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'large', 'ground', 'eye', 'foot', 'tree', 'walking', 'leg', 'ear', 'sand', 'grass', 'pole', 'dirt', 'trunk', 'fence', 'elephant']
2022-03-16 08:53:30,811.811 2829:trainer.py:487 do_train_dict(): eta: 1 day, 2:15:38 iter: 9000 speed: 309.2 images/sec total_norm: 126.0652 (130.1862) loss: 159.0853 (160.9368) masked_loss: 1.8702 (1.9007) tag_loss: 157.0059 (159.0362) time: 1.4334 (1.6561) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4284 (1.6509) save_time: 73.3883 (73.3883) lr: 0.000086 max mem: 26307
2022-03-16 08:53:31,171.171 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4545454680919647
2022-03-16 08:53:31,172.172 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 166.279541015625
2022-03-16 08:53:31,172.172 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.02513851962247
2022-03-16 08:53:36,961.961 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01662563905119896
2022-03-16 08:53:36,961.961 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 08:53:36,962.962 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'is', 'getting', 'his', 'hair', '[MASK]', 'by', 'a', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 08:53:36,977.977 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hand', 'head', 'shirt', 'man', 'hair', 'face', 'nose', '[UNK]', 'ear', 'arm', 'person', 'wall', 'hat', 'eye', 'knife', 'building', 'woman', 'window', 'mouth', 'handle', 'sign', 'table', 'plate', 'bracelet', 'collar', 'food', 'door', 'pole', 'glasses', 'bag', 'container', 'watch', 'mustache', 'scissors', 'dress', 'ground', 'boy', 'cap', 'tree', 'chair', 'fork', 'bowl', 'sky', 'jacket', 'button', 'ceiling', 'bus', 'paper', 'pan', 'umbrella']
2022-03-16 08:53:52,926.926 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'hand', 'face', 'light', 'hair', 'person', 'table', 'wall', 'cut', 'chair', 'window', 'box', 'sign', 'shirt', 'ear', 'bowl', 'clock', 'mirror', 'knife', 'paint', 'cloth', 'pipe', 'beard', 'towel', 'robe', 'poster', 'barber', 'apron']
2022-03-16 08:56:16,340.340 2829:trainer.py:487 do_train_dict(): eta: 1 day, 2:12:59 iter: 9100 speed: 309.3 images/sec total_norm: 130.0455 (136.3868) loss: 159.1187 (160.4913) masked_loss: 1.8685 (1.8612) tag_loss: 157.7375 (158.6301) time: 1.4323 (1.6553) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4272 (1.6501) save_time: 73.3883 (73.3883) lr: 0.000086 max mem: 26307
2022-03-16 08:56:16,703.703 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6666666865348816
2022-03-16 08:56:16,703.703 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 130.28884887695312
2022-03-16 08:56:16,703.703 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.05082578244416
2022-03-16 08:56:22,474.474 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01661892607808113
2022-03-16 08:56:22,474.474 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 08:56:22,474.474 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'an', 'array', '[MASK]', 'past', '##ries', 'next', '[MASK]', 'five', 'boxes', 'lined', 'up', 'next', 'to', '[MASK]', 'other', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 08:56:22,489.489 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'box', 'food', 'sign', 'table', 'plate', 'pastry', 'meat', 'hole', 'different', 'wall', 'design', 'potato', 'writing', 'pile', 'bread', 'paper', 'sandwich', 'book', 'top', 'variety', 'chicken', 'various', 'other', 'next', 'container', 'dessert', 'bag', 'mushroom', 'light', 'letter', 'bunch', 'sugar', 'large', 'menu', 'cookie', 'french', 'label', 'close', 'picture', 'full', 'hamburger', 'white', 'store', 'many', 'dog', 'tray', 'logo', 'napkin', 'shelf']
2022-03-16 08:56:38,501.501 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'other', 'next', 'food', 'box', 'plate', 'chicken', 'array', 'pastry']
2022-03-16 08:59:01,970.970 2829:trainer.py:487 do_train_dict(): eta: 1 day, 2:10:20 iter: 9200 speed: 309.1 images/sec total_norm: 125.5824 (130.6383) loss: 157.1022 (158.2484) masked_loss: 1.9556 (1.9489) tag_loss: 154.9592 (156.2994) time: 1.4329 (1.6563) data: 0.0002 (0.0002) to_device: 0.0049 (0.0047) time_gpu: 1.4278 (1.6514) save_time: 73.3883 (73.3883) lr: 0.000086 max mem: 26307
2022-03-16 08:59:02,331.331 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5
2022-03-16 08:59:02,331.331 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 130.73724365234375
2022-03-16 08:59:02,331.331 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.13154889178533
2022-03-16 08:59:08,168.168 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0166626013815403
2022-03-16 08:59:08,169.169 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 08:59:08,169.169 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'of', 'rhino', '##s', '[MASK]', 'in', 'a', 'field', 'near', 'a', 'zebra', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 08:59:08,185.185 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['ground', 'shadow', 'leg', 'rock', 'zebra', 'tree', 'tail', 'zoo', 'wall', 'ear', '[UNK]', 'head', 'trunk', 'enclosure', 'animal', 'log', 'mane', 'boulder', 'dirt', 'nose', 'branch', 'stripe', 'elephant', 'mouth', 'horn', 'sand', 'eye', 'neck', 'stick', 'pen', 'back', 'fence', 'other', 'area', 'wood', 'hole', 'shade', 'next', 'grass', 'habitat', 'group', 'foot', 'face', 'pole', 'baby', 'sky', 'pig', 'water', 'body', 'couple']
2022-03-16 08:59:24,306.306 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'field', 'ground', 'rock', 'wall', 'standing', 'couple', 'tree', 'wood', 'branch', 'animal', 'leg', 'shadow', 'tail', 'dirt', 'log', 'zoo', 'enclosure', 'stripe', 'mane', 'zebra']
2022-03-16 09:01:47,806.806 2829:trainer.py:487 do_train_dict(): eta: 1 day, 2:07:43 iter: 9300 speed: 308.7 images/sec total_norm: 128.4512 (131.5962) loss: 158.2369 (159.9482) masked_loss: 1.8774 (1.8981) tag_loss: 155.9892 (158.0501) time: 1.4331 (1.6584) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4279 (1.6533) save_time: 73.3883 (73.3883) lr: 0.000086 max mem: 26307
2022-03-16 09:01:48,166.166 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6176470518112183
2022-03-16 09:01:48,166.166 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 170.9694366455078
2022-03-16 09:01:48,167.167 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.14512333971389
2022-03-16 09:01:54,041.041 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01666449010372162
2022-03-16 09:01:54,042.042 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 09:01:54,042.042 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'computer', '[MASK]', 'a', 'keyboard', 'sitting', '[MASK]', 'the', 'desk', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 09:01:54,058.058 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['monitor', 'computer', 'desk', 'keyboard', 'box', 'screen', 'mouse', 'stand', 'base', 'table', 'wall', 'cord', '[UNK]', 'key', 'speaker', 'logo', 'pad', 'office', 'window', 'button', 'wire', 'paper', 'book', 'lamp', 'phone', 'pen', 'light', 'drawer', 'desktop', 'laptop', 'container', 'bottle', 'cup', 'tower', 'shelf', 'television', 'top', 'handle', 'room', 'picture', 'cabinet', 'white', 'next', 'chair', 'plug', 'board', 'floor', 'cap', 'lid', 'telephone']
2022-03-16 09:02:10,059.059 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'player', 'wall', 'base', 'stand', 'computer', 'box', 'tower', 'screen', 'desk', 'speaker', 'remote', 'mouse', 'monitor', 'logo', 'keyboard', 'cord', 'plug']
2022-03-16 09:04:33,608.608 2829:trainer.py:487 do_train_dict(): eta: 1 day, 2:05:05 iter: 9400 speed: 308.8 images/sec total_norm: 125.5806 (129.5521) loss: 159.8548 (160.1774) masked_loss: 1.8836 (1.9076) tag_loss: 157.8070 (158.2698) time: 1.4329 (1.6580) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4277 (1.6529) save_time: 73.3883 (73.3883) lr: 0.000086 max mem: 26307
2022-03-16 09:04:33,969.969 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5142857432365417
2022-03-16 09:04:33,970.970 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 149.71566772460938
2022-03-16 09:04:33,970.970 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.15443621183697
2022-03-16 09:04:39,934.934 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.016728058457374573
2022-03-16 09:04:39,934.934 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 09:04:39,934.934 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', 'bathroom', 'is', 'all', 'white', 'and', 'has', 'no', 'towels', '.', 'indirect', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 09:04:39,949.949 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['mirror', 'wall', 'bathroom', 'sink', '[UNK]', 'shelf', 'curtain', 'toilet', 'cabinet', 'tile', 'window', 'outlet', 'knob', 'handle', 'floor', 'light', 'seat', 'lid', 'pipe', 'white', 'drain', 'door', 'tank', 'reflection', 'shower', 'ceiling', 'rod', 'towel', 'camera', 'bag', 'woman', 'soap', 'bottle', 'person', 'hair', 'head', 'paper', 'holder', 'rack', 'man', 'small', 'ring', 'drawer', 'hole', 'picture', 'can', 'tub', 'dish', 'frame', 'glass']
2022-03-16 09:04:55,933.933 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'white', 'woman', 'person', 'wall', 'window', 'frame', 'handle', 'cabinet', 'mirror', 'bathroom', 'sink', 'purse', 'reflection', 'towel', 'lamp', 'curtain', 'shelf', 'outlet', 'tile', 'tub', 'rack']
2022-03-16 09:07:19,474.474 2829:trainer.py:487 do_train_dict(): eta: 1 day, 2:02:27 iter: 9500 speed: 308.7 images/sec total_norm: 126.1663 (127.4436) loss: 155.6365 (157.4723) masked_loss: 1.8522 (1.8648) tag_loss: 153.8805 (155.6075) time: 1.4342 (1.6586) data: 0.0001 (0.0002) to_device: 0.0049 (0.0048) time_gpu: 1.4295 (1.6536) save_time: 73.3883 (73.3883) lr: 0.000086 max mem: 26307
2022-03-16 09:07:19,835.835 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.529411792755127
2022-03-16 09:07:19,835.835 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 165.87757873535156
2022-03-16 09:07:19,835.835 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.18995300928752
2022-03-16 09:07:25,779.779 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.016732526943087578
2022-03-16 09:07:25,779.779 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 09:07:25,780.780 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'man', 'riding', 'a', '[MASK]', '##board', 'across', 'a', 'cement', 'ramp', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 09:07:25,795.795 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['person', 'man', '[UNK]', 'jean', 'arm', 'leg', 'shirt', 'ground', 'hand', 'head', 'hat', 'short', 'hair', 'background', 'foot', 'beach', 'sky', 'sand', 'shoe', 'face', 'water', 'board', 'cap', 'shadow', 'boy', 'wheel', 'knee', 'woman', 'logo', 'grass', 'girl', 'ear', 'building', 'eye', 'top', 'ocean', 'mouth', 'wall', 'back', 'elbow', 'tree', 'white', 'road', 'young', 'nose', 'sleeve', 'wave', 'picture', 'line', 'fence']
2022-03-16 09:07:41,813.813 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'ground', 'foot', 'sky', 'jean', 'leg', 'wheel', 'hat', 'knee', 'shoe', 'cement', 'ramp', 'sock']
2022-03-16 09:10:05,252.252 2829:trainer.py:487 do_train_dict(): eta: 1 day, 1:59:49 iter: 9600 speed: 308.8 images/sec total_norm: 126.5201 (130.6445) loss: 157.8926 (157.8934) masked_loss: 1.8769 (1.8979) tag_loss: 156.0144 (155.9954) time: 1.4320 (1.6578) data: 0.0001 (0.0002) to_device: 0.0049 (0.0048) time_gpu: 1.4274 (1.6529) save_time: 73.3883 (73.3883) lr: 0.000086 max mem: 26307
2022-03-16 09:10:05,612.612 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.38235294818878174
2022-03-16 09:10:05,612.612 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 151.73374938964844
2022-03-16 09:10:05,612.612 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.20913963711139
2022-03-16 09:10:11,634.634 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.016805104911327362
2022-03-16 09:10:11,634.634 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 09:10:11,635.635 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'traffic', 'and', 'street', 'signs', '[MASK]', 'a', 'wooden', 'pole', '##ophone', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 09:10:11,650.650 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'sign', 'letter', 'sky', 'stop', 'pole', 'street', 'road', '[UNK]', 'building', 'post', 'trunk', 'car', 'light', 'window', 'grass', 'bush', 'ground', 'leaf', 'bolt', 'background', 'sidewalk', 'red', 'roof', 'house', 'branch', 'fence', 'intersection', 'line', 'arrow', 'person', 'wall', 'curb', 'front', 'man', 'traffic', 'graffiti', 'shadow', 'word', 'truck', 'shirt', 'next', 'tire', 'top', 'power', 'bracket', 'screw', 'corner', 'wire', 'side']
2022-03-16 09:10:27,627.627 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'number', 'street', 'stop', 'tree', 'letter', 'border', 'sign', 'sky', 'wooden', 'pole', 'leaf', 'rope', 'strap']
03-16 09:12:28.301 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi
03-16 09:12:28.301 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi
03-16 09:12:29.837 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 95}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 97}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 96}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}]
2022-03-16 09:12:51,216.216 2829:trainer.py:487 do_train_dict(): eta: 1 day, 1:57:11 iter: 9700 speed: 308.5 images/sec total_norm: 127.6420 (129.0932) loss: 157.9305 (158.7188) masked_loss: 1.8871 (1.9309) tag_loss: 156.2153 (156.7880) time: 1.4332 (1.6597) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4280 (1.6546) save_time: 73.3883 (73.3883) lr: 0.000085 max mem: 26307
2022-03-16 09:12:51,582.582 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6176470518112183
2022-03-16 09:12:51,582.582 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 150.11776733398438
2022-03-16 09:12:51,582.582 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.2287683292311
2022-03-16 09:12:57,675.675 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01684863306581974
2022-03-16 09:12:57,676.676 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 09:12:57,676.676 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'that', 'is', 'jumping', '[MASK]', 'the', 'air', 'with', 'a', 'skate', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 09:12:57,692.692 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', '[UNK]', 'pole', 'shirt', 'building', 'man', 'hat', 'cloud', 'wheel', 'boy', 'ground', 'hand', 'shadow', 'shoe', 'person', 'head', 'light', 'street', 'tree', 'cap', 'arm', 'car', 'wire', 'line', 'jean', 'sidewalk', 'ramp', 'road', 'window', 'fence', 'sign', 'grass', 'wall', 'park', 'skate', 'board', 'leg', 'short', 'curb', 'roof', 'air', 'trick', 'hair', 'bush', 'tire', 'foot', 'railing', 'young', 'bench', 'power']
2022-03-16 09:13:13,777.777 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'air', 'building', 'road', 'street', 'car', 'ground', 'arm', 'van', 'sign', 'sky', 'shirt', 'background', 'shadow', 'wheel', 'brick', 'hat', 'cloud', 'pole', 'wire', 'trick', 'barrel', 'shoe', 'sidewalk']
2022-03-16 09:15:37,472.472 2829:trainer.py:487 do_train_dict(): eta: 1 day, 1:54:35 iter: 9800 speed: 308.0 images/sec total_norm: 127.4463 (128.1307) loss: 151.4152 (153.5290) masked_loss: 1.7324 (1.8380) tag_loss: 149.8911 (151.6910) time: 1.4351 (1.6625) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4302 (1.6575) save_time: 73.3883 (73.3883) lr: 0.000085 max mem: 26307
2022-03-16 09:15:37,833.833 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5428571701049805
2022-03-16 09:15:37,833.833 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 129.14585876464844
2022-03-16 09:15:37,833.833 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.28380638662011
2022-03-16 09:15:43,937.937 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.016833890229463577
2022-03-16 09:15:43,937.937 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 09:15:43,938.938 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'tennis', 'player', '[MASK]', 'swinging', 'at', 'a', 'volley', 'during', 'a', 'match', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 09:15:43,953.953 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'shoe', 'man', 'short', '[UNK]', 'sock', 'tennis', 'court', 'leg', 'hand', 'wall', 'ground', 'hat', 'ball', 'head', 'hair', 'person', 'player', 'cap', 'boy', 'logo', 'line', 'sign', 'letter', 'fence', 'uniform', 'banner', 'arm', 'shadow', 'dirt', 'game', 'chair', 'stripe', 'woman', 'net', 'outfit', 'window', 'number', 'pole', 'handle', 'tree', 'playing', 'jacket', 'background', 'top', 'match', 'blue', 'group', 'plant', 'stand']
2022-03-16 09:16:00,018.018 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'line', 'player', 'court', 'short', 'ground', 'hair', 'match', 'wall', 'arm', 'ball', 'shirt', 'leg', 'tennis', 'shadow', 'hat', 'shoe', 'outfit', 'volley', 'sock']
2022-03-16 09:18:23,347.347 2829:trainer.py:487 do_train_dict(): eta: 1 day, 1:51:57 iter: 9900 speed: 308.7 images/sec total_norm: 127.9107 (130.6507) loss: 159.0418 (159.7381) masked_loss: 1.8340 (1.8654) tag_loss: 157.0418 (157.8728) time: 1.4322 (1.6588) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4270 (1.6537) save_time: 73.3883 (73.3883) lr: 0.000085 max mem: 26307
2022-03-16 09:18:23,707.707 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091
2022-03-16 09:18:23,708.708 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 159.0889892578125
2022-03-16 09:18:23,708.708 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.26748870849609
2022-03-16 09:18:29,858.858 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.016970515251159668
2022-03-16 09:18:29,859.859 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 09:18:29,859.859 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'people', 'in', 'the', '[MASK]', 'room', 'playing', 'the', 'nintendo', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 09:18:29,874.874 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hair', 'wall', 'shirt', 'television', 'hand', 'man', 'jean', 'boy', 'game', '[UNK]', 'controller', 'picture', 'floor', 'stand', 'arm', 'remote', 'head', 'ear', 'cord', 'video', 'logo', 'wii', 'ceiling', 'room', 'strap', 'glasses', 'person', 'bracelet', 'screen', 'face', 'door', 'table', 'microphone', 'design', 'leg', 'speaker', 'light', 'girl', 'book', 'carpet', 'tv', 'paper', 'young', 'rug', 'switch', 'woman', 'dvd', 'button', 'short', 'shelf']
2022-03-16 09:18:45,845.845 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'room', 'book', 'player', 'living', 'television', 'hair', 'tv', 'wall', 'arm', 'boy', 'stand', 'chair', 'box', 'jean', 'shirt', 'picture', 'finger', 'dvd', 'logo', 'shelf', 'controller', 'wii']
2022-03-16 09:21:09,372.372 2829:trainer.py:487 do_train_dict(): eta: 1 day, 1:49:19 iter: 10000 speed: 308.4 images/sec total_norm: 127.0210 (130.2868) loss: 156.4432 (157.7097) masked_loss: 1.8474 (1.8374) tag_loss: 154.3111 (155.8722) time: 1.4320 (1.6602) data: 0.0001 (0.0005) to_device: 0.0050 (0.0049) time_gpu: 1.4269 (1.6548) save_time: 73.3883 (73.3883) lr: 0.000085 max mem: 26307
2022-03-16 09:21:09,374.374 2829:checkpoint.py:222 save(): Saving checkpoint to output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0010000.pt
2022-03-16 09:21:18,627.627 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4545454680919647
2022-03-16 09:21:18,627.627 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 148.15859985351562
2022-03-16 09:21:18,627.627 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.28933020865563
2022-03-16 09:21:24,824.824 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.016981514170765877
2022-03-16 09:21:24,824.824 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 09:21:24,825.825 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'man', 'standing', 'in', 'front', 'of', 'a', '[MASK]', 'stop', 'sign', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 09:21:24,839.839 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['nose', 'ear', 'eye', 'man', 'hair', 'face', 'shirt', 'head', 'mouth', 'collar', 'letter', 'sign', 'pole', 'building', 'wall', 'sidewalk', 'chin', '[UNK]', 'brick', 'front', 'street', 'plant', 'stop', 'pillar', 'door', 'button', 'jacket', 'neck', 'red', 'word', 'post', 'next', 'ground', 'column', 'line', 'background', 'arm', 'window', 'person', 'handle', 'circle', 'leaf', 'white', 'road', 'car', 'lip', 'hand', 'object', 'sleeve', 'curb']
2022-03-16 09:21:40,663.663 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'face', 'building', 'front', 'round', 'hair', 'stop', 'mouth', 'wall', 'eye', 'neck', 'letter', 'sign', 'shirt', 'nose', 'ear', 'handle', 'collar']
2022-03-16 09:24:03,637.637 2829:trainer.py:487 do_train_dict(): eta: 1 day, 1:47:27 iter: 10100 speed: 293.8 images/sec total_norm: 126.9644 (129.0071) loss: 151.6999 (157.2151) masked_loss: 1.8096 (1.8611) tag_loss: 150.3896 (155.3540) time: 1.4336 (1.7426) data: 0.0001 (0.0002) to_device: 0.0051 (0.0048) time_gpu: 1.4282 (1.6488) save_time: 8.8805 (41.1344) lr: 0.000085 max mem: 26307
2022-03-16 09:24:03,997.997 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091
2022-03-16 09:24:03,997.997 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 163.59783935546875
2022-03-16 09:24:03,997.997 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.30462855918735
2022-03-16 09:24:10,207.207 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01699250005185604
2022-03-16 09:24:10,207.207 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 09:24:10,208.208 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'superiors', '[MASK]', 'has', 'nothing', 'on', 'the', 'counter', 'top', 'except', 'an', 'ipod', 'holder', '/', 'speaker', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 09:24:10,223.223 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'kitchen', 'light', '[UNK]', 'ceiling', 'floor', 'cabinet', 'stove', 'picture', 'oven', 'hood', 'refrigerator', 'table', 'door', 'vent', 'microwave', 'sink', 'handle', 'room', 'drawer', 'speaker', 'outlet', 'bottle', 'fan', 'top', 'shelf', 'chair', 'vase', 'coffee', 'clock', 'counter', 'window', 'stool', 'stand', 'maker', 'phone', 'large', 'television', 'kettle', 'plate', 'white', 'knob', 'island', 'tile', 'towel', 'bowl', 'bar', 'frame', 'holder', 'box']
2022-03-16 09:24:26,245.245 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'top', 'door', 'light', 'nothing', 'floor', 'wall', 'window', 'kitchen', 'picture', 'scale', 'coffee', 'counter', 'handle', 'plate', 'cabinet', 'ceiling', 'maker', 'tray', 'drawer', 'outlet', 'tile', 'stove', 'knob', 'oven', 'microwave', 'vent', 'kettle', 'spacious', 'ipod']
2022-03-16 09:26:49,748.748 2829:trainer.py:487 do_train_dict(): eta: 1 day, 1:44:49 iter: 10200 speed: 308.2 images/sec total_norm: 128.0937 (130.6670) loss: 158.9673 (159.5079) masked_loss: 1.8799 (1.9271) tag_loss: 157.1065 (157.5808) time: 1.4333 (1.6611) data: 0.0001 (0.0002) to_device: 0.0049 (0.0048) time_gpu: 1.4284 (1.6561) save_time: 8.8805 (41.1344) lr: 0.000085 max mem: 26307
2022-03-16 09:26:50,113.113 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6363636255264282
2022-03-16 09:26:50,113.113 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 138.9566650390625
2022-03-16 09:26:50,113.113 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.34250263103003
2022-03-16 09:26:56,390.390 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017061792314052582
2022-03-16 09:26:56,391.391 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 09:26:56,391.391 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'blue', 'two', '[MASK]', '[MASK]', 'on', 'a', 'highway', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 09:26:56,406.406 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['window', 'bus', 'building', 'sign', '[UNK]', 'tire', 'road', 'sky', 'grill', 'fence', 'street', 'plate', 'license', 'wheel', 'double', 'pole', 'sidewalk', 'light', 'windshield', 'line', 'decker', 'person', 'front', 'car', 'wall', 'mirror', 'top', 'man', 'stop', 'letter', 'shirt', 'store', 'deck', 'curb', 'railing', 'roof', 'advertisement', 'door', 'woman', 'city', 'number', 'post', 'tree', 'brick', 'driver', 'gate', 'jacket', 'truck', 'coat', 'box']
2022-03-16 09:27:12,467.467 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'number', 'line', 'building', 'top', 'road', 'street', 'light', 'woman', 'story', 'car', 'blue', 'person', 'bridge', 'highway', 'window', 'box', 'letter', 'sign', 'sky', 'shirt', 'bus', 'traffic', 'truck', 'wheel', 'mirror', 'pole', 'fence', 'sidewalk', 'tire', 'advertisement', 'grill', 'windshield']
2022-03-16 09:29:36,056.056 2829:trainer.py:487 do_train_dict(): eta: 1 day, 1:42:12 iter: 10300 speed: 307.9 images/sec total_norm: 128.6246 (130.2854) loss: 158.3511 (158.1128) masked_loss: 1.7148 (1.7843) tag_loss: 156.4207 (156.3285) time: 1.4336 (1.6631) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4287 (1.6580) save_time: 8.8805 (41.1344) lr: 0.000084 max mem: 26307
2022-03-16 09:29:36,418.418 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5277777910232544
2022-03-16 09:29:36,418.418 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 164.6593017578125
2022-03-16 09:29:36,418.418 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.33844126187839
2022-03-16 09:29:42,732.732 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017056787386536598
2022-03-16 09:29:42,732.732 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 09:29:42,733.733 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'brown', 'bear', '[MASK]', 'on', 'quincy', 'of', 'cement', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 09:29:42,748.748 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['nose', 'bear', 'head', 'eye', 'paw', 'ear', 'claw', 'rock', 'ground', 'shadow', 'mouth', 'face', 'water', 'leg', 'wall', 'arm', 'moss', 'brown', 'grass', 'snout', 'log', 'foot', 'stone', '[UNK]', 'nail', 'neck', 'background', 'zoo', 'ledge', 'dirt', 'reflection', 'muzzle', 'polar', 'large', 'tree', 'puddle', 'body', 'wood', 'animal', 'pond', 'fur', 'enclosure', 'top', 'trunk', 'step', 'leaf', 'tail', 'area', 'snow', 'white']
2022-03-16 09:29:58,733.733 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'face', 'top', 'ground', 'rock', 'brown', 'eye', 'neck', 'foot', 'leg', 'nose', 'ear', 'bear', 'shadow', 'trunk', 'log', 'moss', 'cement', 'claw', 'paw']
2022-03-16 09:32:22,324.324 2829:trainer.py:487 do_train_dict(): eta: 1 day, 1:39:35 iter: 10400 speed: 307.9 images/sec total_norm: 127.1969 (139.0600) loss: 159.6070 (161.2413) masked_loss: 1.8556 (1.9298) tag_loss: 157.9227 (159.3115) time: 1.4347 (1.6627) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4296 (1.6575) save_time: 8.8805 (41.1344) lr: 0.000084 max mem: 26307
2022-03-16 09:32:22,684.684 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5454545617103577
2022-03-16 09:32:22,684.684 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 164.86825561523438
2022-03-16 09:32:22,684.684 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.32633434477306
2022-03-16 09:32:29,033.033 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017041685059666634
2022-03-16 09:32:29,034.034 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 09:32:29,034.034 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'person', '[MASK]', 'a', 'child', 'standing', 'outside', 'holding', 'dogs', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 09:32:29,049.049 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'hand', 'fence', 'hair', 'head', 'tree', 'boy', '[UNK]', 'logo', 'man', 'face', 'sky', 'building', 'ground', 'arm', 'window', 'water', 'ear', 'bush', 'pole', 'short', 'track', 'floor', 'person', 'roof', 'leg', 'wall', 'hat', 'nose', 'glasses', 'eye', 'mouth', 'post', 'sidewalk', 'railing', 'train', 'shoe', 'table', 'jean', 'grass', 'light', 'background', 'design', 'camera', 'trunk', 'sign', 'child', 'umbrella', 'cap', 'strap']
2022-03-16 09:32:45,063.063 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'hand', 'face', 'woman', 'short', 'ground', 'hair', 'girl', 'outside', 'person', 'floor', 'child', 'chair', 'foot', 'tree', 'shirt', 'dog', 'leg', 'bag', 'ear', 'bush', 'hat', 'cap', 'glasses', 'fence', 'collar', 'reflection', 'shoe', 'sidewalk', 'tile', 'sweater', 'sunglasses', 'chimney', 'cushion', 'leash']
2022-03-16 09:35:08,496.496 2829:trainer.py:487 do_train_dict(): eta: 1 day, 1:36:56 iter: 10500 speed: 308.1 images/sec total_norm: 126.7063 (129.0044) loss: 155.6191 (157.0873) masked_loss: 1.7329 (1.7761) tag_loss: 153.5807 (155.3112) time: 1.4328 (1.6617) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4278 (1.6567) save_time: 8.8805 (41.1344) lr: 0.000084 max mem: 26307
2022-03-16 09:35:08,857.857 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4848484992980957
2022-03-16 09:35:08,858.858 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 157.58441162109375
2022-03-16 09:35:08,858.858 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.31941950096274
2022-03-16 09:35:15,241.241 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017158575356006622
2022-03-16 09:35:15,242.242 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 09:35:15,242.242 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'large', 'gray', 'teddy', 'bear', 'laying', 'on', 'activation', '##tton', 'a', 'blanket', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 09:35:15,257.257 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['nose', 'bear', 'head', 'ear', 'mouth', 'teddy', 'eye', 'bow', 'face', 'ribbon', 'arm', 'pillow', 'blanket', 'leg', 'foot', 'muzzle', 'bed', 'paw', 'stuffed', 'neck', 'cloth', 'white', '[UNK]', 'next', 'brown', 'couch', 'scarf', 'tag', 'animal', 'other', 'sheet', 'hair', 'hand', 'fur', 'flower', 'shirt', 'wall', 'tie', 'red', 'pad', 'tail', 'chair', 'floor', 'line', 'collar', 'top', 'dog', 'laying', 'small', 'letter']
2022-03-16 09:35:31,260.260 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'face', 'large', 'top', 'control', 'mouth', 'arm', 'eye', 'foot', 'gray', 'leg', 'nose', 'ear', 'bear', 'cloth', 'blanket', 'pillow', 'ribbon', 'teddy', 'stripe', 'scarf']
2022-03-16 09:37:54,695.695 2829:trainer.py:487 do_train_dict(): eta: 1 day, 1:34:18 iter: 10600 speed: 308.1 images/sec total_norm: 126.0428 (127.9087) loss: 157.5936 (158.6166) masked_loss: 1.8473 (1.8692) tag_loss: 155.2387 (156.7474) time: 1.4327 (1.6620) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4277 (1.6570) save_time: 8.8805 (41.1344) lr: 0.000084 max mem: 26307
2022-03-16 09:37:55,058.058 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6285714507102966
2022-03-16 09:37:55,058.058 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 143.2294921875
2022-03-16 09:37:55,058.058 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.35984388690129
2022-03-16 09:38:01,513.513 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017201177775859833
2022-03-16 09:38:01,513.513 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 09:38:01,513.513 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'men', 'at', 'a', 'a', 'white', 'board', 'talking', 'with', 'a', 'samsung', 'sign', '##tees', 'them', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 09:38:01,529.529 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['man', 'suit', 'tie', 'shirt', 'tag', 'hair', '[UNK]', 'jacket', 'hand', 'wall', 'face', 'badge', 'head', 'glasses', 'person', 'name', 'sign', 'paper', 'table', 'floor', 'group', 'ribbon', 'flag', 'woman', 'poster', 'room', 'ear', 'book', 'light', 'chair', 'microphone', 'door', 'shoe', 'screen', 'business', 'beard', 'neck', 'nose', 'desk', 'letter', 'curtain', 'board', 'banner', 'ceiling', 'carpet', 'necklace', 'writing', 'plaque', 'bottle', 'arm']
2022-03-16 09:38:17,546.546 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'name', 'hand', 'face', 'white', 'board', 'hair', 'person', 'floor', 'table', 'wall', 'writing', 'computer', 'letter', 'sign', 'shirt', 'screen', 'nose', 'display', 'suit', 'tie', 'tag', 'button', 'jacket', 'glasses', 'keyboard', 'badge', 'shoe', 'poster']
2022-03-16 09:40:41,205.205 2829:trainer.py:487 do_train_dict(): eta: 1 day, 1:31:41 iter: 10700 speed: 307.5 images/sec total_norm: 127.5492 (128.6291) loss: 153.5759 (155.3939) masked_loss: 1.7093 (1.7730) tag_loss: 152.3310 (153.6210) time: 1.4340 (1.6651) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4292 (1.6600) save_time: 8.8805 (41.1344) lr: 0.000084 max mem: 26307
2022-03-16 09:40:41,568.568 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.40625
2022-03-16 09:40:41,568.568 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 153.51718139648438
2022-03-16 09:40:41,568.568 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.32912543967917
2022-03-16 09:40:48,096.096 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017141008749604225
2022-03-16 09:40:48,096.096 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 09:40:48,097.097 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', '[MASK]', 'brown', 'bear', 'standing', 'in', 'a', 'field', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 09:40:48,112.112 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['grass', 'log', 'bear', 'ground', 'head', 'ear', 'water', 'bush', 'nose', 'leg', 'rock', 'mouth', 'eye', 'shadow', 'tree', 'snout', 'tongue', 'flower', 'face', 'plant', 'paw', 'back', 'background', '[UNK]', 'brown', 'river', 'zoo', 'dirt', 'fur', 'pond', 'wall', 'trunk', 'weed', 'boulder', 'large', 'teeth', 'wood', 'neck', 'branch', 'leaf', 'food', 'enclosure', 'fence', 'light', 'sun', 'reflection', 'walking', 'stone', 'black', 'patch']
2022-03-16 09:41:04,088.088 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['back', 'head', 'water', 'large', 'field', 'ground', 'rock', 'mouth', 'brown', 'leg', 'background', 'tongue', 'nose', 'ear', 'bear', 'shadow', 'grass', 'bush', 'flower', 'trunk', 'pond', 'log', 'curb', 'paw']
03-16 09:42:29.937 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi
03-16 09:42:29.937 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi
03-16 09:42:30.896 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}]
2022-03-16 09:43:27,663.663 2829:trainer.py:487 do_train_dict(): eta: 1 day, 1:29:04 iter: 10800 speed: 307.6 images/sec total_norm: 133.1298 (138.8519) loss: 153.4497 (154.2352) masked_loss: 1.8300 (1.8623) tag_loss: 151.7598 (152.3729) time: 1.4340 (1.6646) data: 0.0001 (0.0002) to_device: 0.0049 (0.0048) time_gpu: 1.4289 (1.6596) save_time: 8.8805 (41.1344) lr: 0.000084 max mem: 26307
2022-03-16 09:43:28,024.024 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6363636255264282
2022-03-16 09:43:28,025.025 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 150.58885192871094
2022-03-16 09:43:28,025.025 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.3430706339145
2022-03-16 09:43:34,601.601 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017158713191747665
2022-03-16 09:43:34,601.601 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 09:43:34,601.601 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'white', 'refrigerator', 'is', '[MASK]', 'a', 'work', 'space', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 09:43:34,617.617 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['refrigerator', 'wall', 'handle', 'floor', 'door', 'ground', 'box', '[UNK]', 'building', 'pole', 'room', 'ceiling', 'ladder', 'light', 'kitchen', 'beam', 'window', 'broom', 'cardboard', 'wood', 'board', 'bag', 'paper', 'cabinet', 'wheel', 'shelf', 'chair', 'sign', 'table', 'crate', 'next', 'stove', 'shadow', 'doorway', 'top', 'brick', 'shovel', 'oven', 'bucket', 'stick', 'view', 'old', 'pipe', 'tool', 'empty', 'cord', 'white', 'open', 'trash', 'picture']
2022-03-16 09:43:50,593.593 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'work', 'room', 'white', 'door', 'ground', 'board', 'space', 'floor', 'wall', 'window', 'box', 'handle', 'ceiling', 'stick', 'pole', 'refrigerator', 'broom']
2022-03-16 09:46:14,064.064 2829:trainer.py:487 do_train_dict(): eta: 1 day, 1:26:27 iter: 10900 speed: 307.7 images/sec total_norm: 128.8097 (130.8143) loss: 155.8550 (156.6989) masked_loss: 1.8910 (1.8794) tag_loss: 154.4810 (154.8195) time: 1.4332 (1.6640) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4282 (1.6590) save_time: 8.8805 (41.1344) lr: 0.000084 max mem: 26307
2022-03-16 09:46:14,426.426 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4285714328289032
2022-03-16 09:46:14,427.427 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 133.0312957763672
2022-03-16 09:46:14,427.427 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.3640462701971
2022-03-16 09:46:21,049.049 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0172384325414896
2022-03-16 09:46:21,049.049 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 09:46:21,050.050 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'man', 'in', 'shirt', 'and', 'tie', '[MASK]', 'at', 'a', 'desk', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 09:46:21,065.065 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['man', 'hand', 'shirt', 'glasses', 'wall', 'tie', 'face', 'hair', 'ear', 'ring', 'head', 'table', 'watch', 'desk', 'base', 'nose', 'chair', 'finger', '[UNK]', 'arm', 'pen', 'collar', 'lamp', 'sleeve', 'wrist', 'computer', 'paper', 'microphone', 'stand', 'laptop', 'cord', 'handle', 'holder', 'glass', 'mouth', 'beard', 'phone', 'mustache', 'office', 'eye', 'woman', 'bottle', 'keyboard', 'notebook', 'tray', 'logo', 'book', 'cup', 'knot', 'wire']
2022-03-16 09:46:37,152.152 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'hair', 'table', 'wall', 'arm', 'base', 'stand', 'chair', 'watch', 'ring', 'shirt', 'finger', 'nose', 'ear', 'desk', 'tie', 'wrist', 'glasses', 'collar', 'scissors', 'mustache']
2022-03-16 09:49:00,685.685 2829:trainer.py:487 do_train_dict(): eta: 1 day, 1:23:50 iter: 11000 speed: 307.3 images/sec total_norm: 126.9760 (130.5846) loss: 160.7057 (161.1748) masked_loss: 1.8001 (1.8784) tag_loss: 158.8291 (159.2964) time: 1.4342 (1.6662) data: 0.0001 (0.0002) to_device: 0.0049 (0.0048) time_gpu: 1.4292 (1.6613) save_time: 8.8805 (41.1344) lr: 0.000083 max mem: 26307
2022-03-16 09:49:01,044.044 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.46875
2022-03-16 09:49:01,044.044 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 167.17774963378906
2022-03-16 09:49:01,044.044 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.39145983446825
2022-03-16 09:49:07,716.716 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0172822754830122
2022-03-16 09:49:07,716.716 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 09:49:07,717.717 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'couple', 'of', '[MASK]', 'walking', 'down', 'a', 'dirt', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 09:49:07,732.732 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['leg', 'cow', 'grass', 'ground', 'head', 'ear', 'tree', '[UNK]', 'road', 'nose', 'path', 'tail', 'bush', 'face', 'eye', 'spot', 'horn', 'leaf', 'sky', 'dirt', 'fence', 'plant', 'herd', 'rock', 'pole', 'shirt', 'building', 'group', 'brown', 'hair', 'mud', 'gravel', 'cattle', 'field', 'light', 'forest', 'tag', 'street', 'man', 'next', 'area', 'animal', 'person', 'foot', 'hat', 'post', 'window', 'other', 'rope', 'pasture']
2022-03-16 09:49:23,784.784 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'face', 'road', 'ground', 'couple', 'eye', 'tree', 'spot', 'path', 'leg', 'nose', 'ear', 'grass', 'tail', 'dirt', 'leaf', 'cow']
2022-03-16 09:51:47,250.250 2829:trainer.py:487 do_train_dict(): eta: 1 day, 1:21:13 iter: 11100 speed: 307.4 images/sec total_norm: 125.0513 (126.8078) loss: 157.1694 (159.3545) masked_loss: 1.8038 (1.8433) tag_loss: 155.4464 (157.5112) time: 1.4338 (1.6656) data: 0.0001 (0.0005) to_device: 0.0050 (0.0048) time_gpu: 1.4286 (1.6602) save_time: 8.8805 (41.1344) lr: 0.000083 max mem: 26307
2022-03-16 09:51:47,612.612 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361
2022-03-16 09:51:47,613.613 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 166.98617553710938
2022-03-16 09:51:47,613.613 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.38234519958496
2022-03-16 09:51:54,290.290 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01734146662056446
2022-03-16 09:51:54,290.290 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 09:51:54,291.291 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'four', 'small', '[MASK]', 'fly', 'low', 'in', 'the', 'obeyed', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 09:51:54,306.306 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'tree', 'airplane', 'wing', 'tail', 'propeller', 'cloud', 'field', 'wheel', 'plane', 'grass', 'person', 'gear', 'landing', 'cockpit', 'engine', 'car', '[UNK]', 'man', 'building', 'ground', 'stripe', 'road', 'air', 'pole', 'aircraft', 'bush', 'shirt', 'fence', 'group', 'post', 'small', 'water', 'body', 'sign', 'leg', 'roof', 'nose', 'house', 'kite', 'letter', 'flag', 'tire', 'blue', 'hill', 'cloudy', 'shadow', 'dirt', 'day', 'lot']
2022-03-16 09:52:10,392.392 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['small', 'low', 'wing', 'tree', 'sky', 'flag', 'wheel', 'tail', 'pole', 'airplane', 'cockpit', 'propeller']
2022-03-16 09:54:33,927.927 2829:trainer.py:487 do_train_dict(): eta: 1 day, 1:18:36 iter: 11200 speed: 307.2 images/sec total_norm: 128.4007 (128.9694) loss: 158.6474 (159.2351) masked_loss: 1.8398 (1.8541) tag_loss: 156.5146 (157.3810) time: 1.4344 (1.6668) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4297 (1.6617) save_time: 8.8805 (41.1344) lr: 0.000083 max mem: 26307
2022-03-16 09:54:34,287.287 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6857143044471741
2022-03-16 09:54:34,287.287 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 157.29931640625
2022-03-16 09:54:34,287.287 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.42940487481852
2022-03-16 09:54:41,041.041 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017329899594187737
2022-03-16 09:54:41,041.041 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 09:54:41,042.042 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'gi', '[MASK]', '##fe', 'and', 'a', 'few', 'zebra', '##s', 'in', 'a', 'sandy', 'area', 'with', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 09:54:41,057.057 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'head', 'leg', '[UNK]', 'fence', 'neck', 'ground', 'rock', 'tail', 'zebra', 'zoo', 'mane', 'ear', 'enclosure', 'pole', 'dirt', 'shadow', 'grass', 'horn', 'spot', 'trunk', 'bush', 'eye', 'sky', 'hair', 'branch', 'mouth', 'wall', 'log', 'pen', 'group', 'other', 'stripe', 'next', 'area', 'boulder', 'post', 'building', 'animal', 'couple', 'plant', 'palm', 'leaf', 'nose', 'front', 'baby', 'bird', 'feeder', 'adult', 'standing']
2022-03-16 09:54:57,067.067 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'area', 'few', 'ground', 'rock', 'neck', 'tree', 'leg', 'ear', 'shadow', 'tail', 'pole', 'dirt', 'horn', 'sandy', 'fence', 'zoo', 'enclosure', 'stripe', 'mane', 'zebra']
2022-03-16 09:57:20,696.696 2829:trainer.py:487 do_train_dict(): eta: 1 day, 1:16:00 iter: 11300 speed: 307.0 images/sec total_norm: 129.0781 (130.7049) loss: 153.2269 (154.9657) masked_loss: 1.8300 (1.9013) tag_loss: 151.2944 (153.0645) time: 1.4341 (1.6677) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4291 (1.6626) save_time: 8.8805 (41.1344) lr: 0.000083 max mem: 26307
2022-03-16 09:57:21,057.057 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6285714507102966
2022-03-16 09:57:21,058.058 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 160.20086669921875
2022-03-16 09:57:21,058.058 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.43586188868473
2022-03-16 09:57:28,051.051 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01731725223362446
2022-03-16 09:57:28,051.051 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 09:57:28,051.051 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'holding', 'a', 'tennis', '[MASK]', '##et', 'is', 'hitting', 'a', 'ball', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 09:57:28,067.067 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'shoe', '[UNK]', 'ball', 'court', 'man', 'line', 'tennis', 'short', 'leg', 'hand', 'sock', 'head', 'shadow', 'wall', 'player', 'sign', 'letter', 'hair', 'arm', 'logo', 'ground', 'person', 'hat', 'stand', 'banner', 'box', 'knee', 'stripe', 'floor', 'handle', 'advertisement', 'foot', 'band', 'writing', 'sleeve', 'board', 'white', 'face', 'star', 'cap', 'glove', 'spectator', 'net', 'black', 'flag', 'base', 'ear', 'wrist', 'boy']
2022-03-16 09:57:44,116.116 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'line', 'court', 'short', 'hair', 'person', 'star', 'wall', 'arm', 'foot', 'box', 'ball', 'letter', 'piano', 'sign', 'shirt', 'handle', 'tennis', 'shadow', 'banner', 'beard', 'shoe', 'sock']
2022-03-16 10:00:07,651.651 2829:trainer.py:487 do_train_dict(): eta: 1 day, 1:13:24 iter: 11400 speed: 306.7 images/sec total_norm: 130.1305 (131.6654) loss: 157.6465 (157.6752) masked_loss: 1.7889 (1.8046) tag_loss: 155.8576 (155.8706) time: 1.4332 (1.6696) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4281 (1.6645) save_time: 8.8805 (41.1344) lr: 0.000083 max mem: 26307
2022-03-16 10:00:08,011.011 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5714285969734192
2022-03-16 10:00:08,012.012 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 162.66915893554688
2022-03-16 10:00:08,012.012 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.44781394626783
2022-03-16 10:00:14,809.809 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01733230985701084
2022-03-16 10:00:14,809.809 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 10:00:14,810.810 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'herd', 'of', 'cattle', 'grazing', '[MASK]', 'a', 'dry', 'field', 'with', 'snow', 'off', '[MASK]', '1690', 'distance', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 10:00:14,825.825 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['ground', 'cow', 'field', 'snow', 'tree', 'animal', 'head', 'leg', 'tail', 'calf', 'water', '[UNK]', 'grass', 'trunk', 'shadow', 'puddle', 'ear', 'rock', 'horn', 'group', 'herd', 'dog', 'bird', 'cattle', 'bull', 'sky', 'branch', 'brown', 'wood', 'background', 'stick', 'couple', 'fence', 'stream', 'face', 'dry', 'sheep', 'small', 'number', 'horse', 'pole', 'open', 'bush', 'next', 'deer', 'grazing', 'large', 'paper', 'mud', 'grassy']
2022-03-16 10:00:30,813.813 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'face', 'field', 'ground', 'distance', 'tree', 'leg', 'dry', 'snow', 'object', 'shadow', 'grass', 'tail', 'bull', 'cow', 'herd', 'calf']
2022-03-16 10:02:54,597.597 2829:trainer.py:487 do_train_dict(): eta: 1 day, 1:10:48 iter: 11500 speed: 306.7 images/sec total_norm: 128.4373 (132.4142) loss: 155.3180 (156.7946) masked_loss: 1.8088 (1.8925) tag_loss: 153.7993 (154.9021) time: 1.4325 (1.6694) data: 0.0001 (0.0002) to_device: 0.0049 (0.0046) time_gpu: 1.4275 (1.6646) save_time: 8.8805 (41.1344) lr: 0.000083 max mem: 26307
2022-03-16 10:02:54,958.958 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663
2022-03-16 10:02:54,959.959 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 146.6671142578125
2022-03-16 10:02:54,959.959 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.49082729734224
2022-03-16 10:03:01,848.848 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017354995012283325
2022-03-16 10:03:01,848.848 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 10:03:01,849.849 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'gi', '[MASK]', '[MASK]', '##s', 'are', 'in', 'a', 'background', 'of', 'tall', 'trees', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 10:03:01,864.864 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', '[UNK]', 'spot', 'neck', 'sky', 'head', 'ear', 'horn', 'eye', 'mouth', 'grass', 'nose', 'leg', 'bush', 'face', 'mane', 'rock', 'field', 'zoo', 'branch', 'knee', 'shadow', 'body', 'ground', 'next', 'dirt', 'other', 'trunk', 'tail', 'fence', 'tall', 'pole', 'front', 'standing', 'hair', 'leaf', 'hill', 'wall', 'flower', 'tongue', 'area', 'grassy', 'couple', 'large', 'green', 'top', 'animal', 'cloud', 'close', 'baby']
2022-03-16 10:03:17,822.822 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'mouth', 'eye', 'neck', 'tree', 'sky', 'spot', 'tall', 'background', 'nose', 'ear', 'horn']
2022-03-16 10:05:41,515.515 2829:trainer.py:487 do_train_dict(): eta: 1 day, 1:08:11 iter: 11600 speed: 306.7 images/sec total_norm: 129.7579 (131.5477) loss: 155.3367 (156.3531) masked_loss: 1.8594 (1.8574) tag_loss: 153.7790 (154.4957) time: 1.4328 (1.6692) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4277 (1.6642) save_time: 8.8805 (41.1344) lr: 0.000083 max mem: 26307
2022-03-16 10:05:41,876.876 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091
2022-03-16 10:05:41,877.877 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 147.26007080078125
2022-03-16 10:05:41,877.877 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.52288551004524
2022-03-16 10:05:48,771.771 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01736370287835598
2022-03-16 10:05:48,771.771 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 10:05:48,771.771 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'women', 'and', 'a', 'man', 'at', 'a', 'table', 'with', 'a', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 10:05:48,787.787 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hair', 'woman', 'glasses', 'suit', 'face', 'glass', 'table', 'man', 'wall', 'necklace', 'shirt', 'microphone', 'window', 'hand', 'cup', 'jacket', 'plate', 'juice', 'tie', 'neck', 'nose', 'head', '[UNK]', 'mug', 'food', 'mouth', 'person', 'name', 'eye', 'spoon', 'chair', 'pitcher', 'bowl', 'napkin', 'cake', 'drink', 'beer', 'paper', 'sign', 'lid', 'fork', 'straw', 'knife', 'handle', 'bread', 'front', 'container', 'logo', 'ear', 'bottle']
2022-03-16 10:06:04,871.871 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'hand', 'face', 'woman', 'cup', 'hair', 'person', 'table', 'wall', 'food', 'glass', 'eye', 'neck', 'window', 'shirt', 'drink', 'nose', 'suit', 'plate', 'beer', 'tie', 'jacket', 'glasses', 'fork', 'juice', 'lid', 'necklace', 'microphone']
2022-03-16 10:08:28,622.622 2829:trainer.py:487 do_train_dict(): eta: 1 day, 1:05:36 iter: 11700 speed: 306.4 images/sec total_norm: 129.1662 (130.7679) loss: 154.2498 (154.9690) masked_loss: 1.7835 (1.8099) tag_loss: 152.5634 (153.1591) time: 1.4338 (1.6711) data: 0.0001 (0.0002) to_device: 0.0048 (0.0046) time_gpu: 1.4291 (1.6663) save_time: 8.8805 (41.1344) lr: 0.000082 max mem: 26307
2022-03-16 10:08:28,983.983 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.3333333432674408
2022-03-16 10:08:28,983.983 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 153.54290771484375
2022-03-16 10:08:28,984.984 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.52462878469693
2022-03-16 10:08:35,956.956 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01742728240787983
2022-03-16 10:08:35,956.956 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 10:08:35,956.956 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', '[MASK]', 'playing', 'wii', 'in', '[MASK]', 'living', 'room', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 10:08:35,973.973 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'man', 'head', 'hand', 'wall', '[UNK]', 'hair', 'lamp', 'pole', 'tent', 'arm', 'face', 'leg', 'bed', 'ear', 'jean', 'pillow', 'couch', 'table', 'shoe', 'foot', 'floor', 'umbrella', 'cord', 'nose', 'glasses', 'light', 'blanket', 'shadow', 'person', 'mouth', 'short', 'room', 'ground', 'boy', 'eye', 'stand', 'shade', 'sheet', 'watch', 'chair', 'woman', 'hat', 'sock', 'top', 'window', 'strap', 'sleeve', 'curtain', 'laptop']
2022-03-16 10:08:52,038.038 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'hand', 'face', 'room', 'light', 'living', 'hair', 'floor', 'table', 'wall', 'arm', 'window', 'watch', 'shirt', 'ear', 'desk', 'couch', 'pole', 'remote', 'wrist', 'monitor', 'shade', 'keyboard', 'sleeve', 'lamp', 'curtain', 'controller', 'wii', 'vent']
2022-03-16 10:11:15,709.709 2829:trainer.py:487 do_train_dict(): eta: 1 day, 1:03:00 iter: 11800 speed: 306.4 images/sec total_norm: 128.8577 (132.7833) loss: 156.1726 (156.8815) masked_loss: 1.7356 (1.8148) tag_loss: 154.0760 (155.0667) time: 1.4335 (1.6708) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4283 (1.6658) save_time: 8.8805 (41.1344) lr: 0.000082 max mem: 26307
2022-03-16 10:11:16,070.070 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.42424243688583374
2022-03-16 10:11:16,070.070 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 140.9473876953125
2022-03-16 10:11:16,070.070 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.55650919425388
2022-03-16 10:11:23,033.033 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017437878996133804
2022-03-16 10:11:23,034.034 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 10:11:23,034.034 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'an', 'older', 'woman', '[MASK]', 'a', 'leopard', 'jacket', 'sitting', 'on', 'a', 'sep', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 10:11:23,049.049 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'sidewalk', 'shoe', 'ground', 'hair', 'man', 'person', 'bag', 'woman', 'head', 'shirt', 'hand', 'jacket', 'leg', 'wall', 'jean', 'shadow', 'curb', 'street', 'arm', 'skirt', 'building', 'coat', 'hat', 'face', 'window', 'girl', 'dress', 'phone', 'road', 'tree', 'sign', 'pole', 'car', 'wheel', 'glasses', 'umbrella', 'line', 'foot', 'boy', 'sky', 'sock', 'purse', 'bench', 'fence', 'scarf', 'boot', 'light', 'water', 'lady']
2022-03-16 10:11:39,029.029 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'hand', 'face', 'line', 'water', 'street', 'woman', 'ground', 'hair', 'person', 'arm', 'foot', 'jean', 'leg', 'dress', 'bag', 'snow', 'bird', 'shadow', 'coat', 'bottle', 'hat', 'jacket', 'bench', 'glasses', 'purse', 'skirt', 'shoe', 'sidewalk', 'leopard', 'pigeon', 'sunglasses', 'scarf', 'bracelet', 'sock']
03-16 10:12:30.989 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi
03-16 10:12:30.989 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi
03-16 10:12:32.084 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 89}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 95}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 93}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}]
2022-03-16 10:14:02,733.733 2829:trainer.py:487 do_train_dict(): eta: 1 day, 1:00:23 iter: 11900 speed: 306.5 images/sec total_norm: 127.8818 (133.6626) loss: 157.7047 (159.3980) masked_loss: 1.8176 (1.8846) tag_loss: 155.8960 (157.5134) time: 1.4314 (1.6702) data: 0.0001 (0.0001) to_device: 0.0050 (0.0050) time_gpu: 1.4261 (1.6651) save_time: 8.8805 (41.1344) lr: 0.000082 max mem: 26307
2022-03-16 10:14:03,095.095 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091
2022-03-16 10:14:03,096.096 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 139.77464294433594
2022-03-16 10:14:03,096.096 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.57229817708334
2022-03-16 10:14:10,116.116 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017453555017709732
2022-03-16 10:14:10,116.116 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 10:14:10,116.116 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'street', 'corner', '[MASK]', 'people', 'and', 'a', 'horse', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 10:14:10,131.131 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['man', 'shirt', 'sign', 'building', 'car', '[UNK]', 'street', 'sky', 'tree', 'window', 'roof', 'road', 'pole', 'horse', 'light', 'person', 'sidewalk', 'plate', 'tail', 'house', 'license', 'leg', 'wall', 'curb', 'door', 'hair', 'shoe', 'hat', 'tire', 'shadow', 'line', 'traffic', 'bush', 'van', 'head', 'woman', 'bus', 'cloud', 'jacket', 'truck', 'bag', 'jean', 'windshield', 'city', 'chimney', 'fence', 'stop', 'grass', 'plant', 'wire']
2022-03-16 10:14:26,187.187 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'building', 'road', 'street', 'woman', 'car', 'person', 'child', 'wall', 'van', 'window', 'tree', 'corner', 'horse', 'sign', 'sky', 'jean', 'shirt', 'roof', 'bag', 'kid', 'plate', 'shadow', 'wheel', 'license', 'cloud', 'pole', 'jacket', 'bike', 'bicycle', 'sidewalk', 'tire', 'chimney']
2022-03-16 10:16:49,568.568 2829:trainer.py:487 do_train_dict(): eta: 1 day, 0:57:46 iter: 12000 speed: 306.9 images/sec total_norm: 130.8918 (132.8581) loss: 158.2243 (160.7906) masked_loss: 1.7344 (1.8058) tag_loss: 156.1837 (158.9848) time: 1.4317 (1.6683) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4266 (1.6632) save_time: 8.8805 (41.1344) lr: 0.000082 max mem: 26307
2022-03-16 10:16:49,928.928 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5405405163764954
2022-03-16 10:16:49,929.929 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 192.48577880859375
2022-03-16 10:16:49,929.929 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.55899867538578
2022-03-16 10:16:56,976.976 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017541565001010895
2022-03-16 10:16:56,976.976 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 10:16:56,977.977 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'truck', 'with', 'a', 'shovel', 'attached', 'to', 'the', '[MASK]', 'of', 'it', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 10:16:56,992.992 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tire', 'sky', 'truck', 'window', 'tree', 'road', 'mirror', 'logo', 'light', 'door', 'vest', 'pole', 'building', 'man', 'sign', '[UNK]', 'plate', 'wheel', 'street', 'windshield', 'ground', 'license', 'handle', 'number', 'snow', 'car', 'writing', 'wire', 'roof', 'front', 'bumper', 'jacket', 'line', 'head', 'rim', 'helmet', 'cab', 'traffic', 'hat', 'fence', 'person', 'safety', 'vehicle', 'house', 'driver', 'worker', 'letter', 'cone', 'grill', 'step']
2022-03-16 10:17:12,911.911 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'number', 'building', 'white', 'door', 'road', 'front', 'light', 'car', 'ground', 'writing', 'window', 'tree', 'sign', 'sky', 'safety', 'truck', 'plate', 'shadow', 'license', 'pole', 'jacket', 'wire', 'logo', 'cab', 'rim', 'tire', 'vest', 'windshield', 'hose', 'shovel']
2022-03-16 10:19:36,570.570 2829:trainer.py:487 do_train_dict(): eta: 1 day, 0:55:09 iter: 12100 speed: 306.6 images/sec total_norm: 130.7278 (134.0936) loss: 155.0712 (157.4512) masked_loss: 1.8360 (1.8537) tag_loss: 153.1510 (155.5975) time: 1.4336 (1.6700) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4286 (1.6649) save_time: 8.8805 (41.1344) lr: 0.000082 max mem: 26307
2022-03-16 10:19:36,932.932 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5428571701049805
2022-03-16 10:19:36,932.932 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 150.16307067871094
2022-03-16 10:19:36,933.933 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.57968120887631
2022-03-16 10:19:44,047.047 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0176058541983366
2022-03-16 10:19:44,047.047 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 10:19:44,047.047 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'boy', 'standing', 'near', '[MASK]', 'plate', 'holding', 'a', 'bat', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 10:19:44,063.063 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'shoe', 'grass', 'dirt', 'bat', 'ground', 'tree', '[UNK]', 'fence', 'person', 'girl', 'woman', 'hair', 'shadow', 'hat', 'baseball', 'short', 'field', 'man', 'hand', 'jean', 'boy', 'head', 'pole', 'game', 'leg', 'child', 'light', 'glove', 'bench', 'ball', 'park', 'young', 'cap', 'arm', 'bottle', 'goal', 'net', 'bag', 'helmet', 'base', 'plate', 'gate', 'home', 'sunglasses', 'catcher', 'sock', 'little', 'kid', 'dress']
2022-03-16 10:20:00,109.109 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'home', 'hand', 'little', 'street', 'light', 'woman', 'short', 'field', 'ground', 'hair', 'girl', 'person', 'arm', 'boy', 'chair', 'tree', 'shirt', 'plate', 'shadow', 'grass', 'bottle', 'hat', 'cap', 'pole', 'bench', 'dirt', 'bat', 'fence', 'helmet', 'shoe', 'glove', 'sock']
2022-03-16 10:22:23,621.621 2829:trainer.py:487 do_train_dict(): eta: 1 day, 0:52:32 iter: 12200 speed: 306.5 images/sec total_norm: 130.1516 (132.5796) loss: 154.6140 (156.9583) masked_loss: 1.7344 (1.7705) tag_loss: 152.6684 (155.1878) time: 1.4329 (1.6705) data: 0.0001 (0.0005) to_device: 0.0050 (0.0049) time_gpu: 1.4278 (1.6651) save_time: 8.8805 (41.1344) lr: 0.000082 max mem: 26307
2022-03-16 10:22:23,983.983 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.42424243688583374
2022-03-16 10:22:23,984.984 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 163.18557739257812
2022-03-16 10:22:23,984.984 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.56819152832031
2022-03-16 10:22:31,116.116 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017646921798586845
2022-03-16 10:22:31,116.116 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 10:22:31,117.117 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'green', 'and', 'red', 'fire', 'hydra', '##nt', 'in', 'the', 'pretending', 'of', 'a', 'yard', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 10:22:31,133.133 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'grass', 'fire', 'house', '[UNK]', 'sky', 'roof', 'window', 'trunk', 'chain', 'building', 'top', 'cap', 'park', 'porch', 'bolt', 'ground', 'chimney', 'bush', 'path', 'sidewalk', 'fence', 'green', 'red', 'base', 'pole', 'branch', 'road', 'grassy', 'stair', 'car', 'post', 'leaf', 'yellow', 'person', 'field', 'dirt', 'door', 'flower', 'background', 'rock', 'hill', 'light', 'lawn', 'blue', 'railing', 'sign', 'step', 'pine', 'wall']
2022-03-16 10:22:47,139.139 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'house', 'building', 'top', 'park', 'red', 'car', 'fire', 'green', 'middle', 'chair', 'window', 'tree', 'sky', 'yard', 'background', 'roof', 'chain', 'grass', 'bush', 'cap', 'porch', 'trunk', 'bolt', 'driveway', 'balcony', 'sidewalk', 'plug', 'chimney']
2022-03-16 10:25:10,640.640 2829:trainer.py:487 do_train_dict(): eta: 1 day, 0:49:55 iter: 12300 speed: 306.6 images/sec total_norm: 131.7690 (135.9162) loss: 156.7102 (155.4787) masked_loss: 1.7398 (1.7606) tag_loss: 154.9301 (153.7180) time: 1.4325 (1.6702) data: 0.0001 (0.0002) to_device: 0.0049 (0.0047) time_gpu: 1.4276 (1.6653) save_time: 8.8805 (41.1344) lr: 0.000081 max mem: 26307
2022-03-16 10:25:11,001.001 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.48571428656578064
2022-03-16 10:25:11,006.006 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 145.38250732421875
2022-03-16 10:25:11,006.006 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.56977659656155
2022-03-16 10:25:18,219.219 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017645394429564476
2022-03-16 10:25:18,219.219 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 10:25:18,220.220 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'seated', 'at', 'a', 'chess', '[MASK]', 'with', 'a', '[MASK]', 'on', 'meek', 'other', 'side', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 10:25:18,235.235 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hand', 'ear', 'head', 'cat', 'wall', 'arm', 'eye', 'person', 'leg', 'nose', 'table', 'face', 'girl', 'shadow', 'hair', 'finger', 'woman', '[UNK]', 'picture', 'tail', 'shirt', 'paw', 'mouth', 'man', 'floor', 'elbow', 'thumb', 'boy', 'black', 'child', 'reflection', 'white', 'logo', 'top', 'photo', 'dress', 'front', 'toy', 'seat', 'chair', 'block', 'desk', 'stool', 'sweater', 'young', 'cord', 'ring', 'stand', 'tree', 'small']
2022-03-16 10:25:34,269.269 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'other', 'head', 'man', 'hand', 'side', 'face', 'short', 'hair', 'table', 'wall', 'seat', 'arm', 'eye', 'chair', 'bar', 'block', 'shirt', 'nose', 'ear', 'cat', 'tail', 'cushion']
2022-03-16 10:27:57,870.870 2829:trainer.py:487 do_train_dict(): eta: 1 day, 0:47:19 iter: 12400 speed: 306.2 images/sec total_norm: 128.1555 (133.2681) loss: 154.2426 (156.2095) masked_loss: 1.8579 (1.8185) tag_loss: 152.0333 (154.3910) time: 1.4340 (1.6723) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4291 (1.6672) save_time: 8.8805 (41.1344) lr: 0.000081 max mem: 26307
2022-03-16 10:27:58,230.230 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6000000238418579
2022-03-16 10:27:58,230.230 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 145.18087768554688
2022-03-16 10:27:58,230.230 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.57770806884766
2022-03-16 10:28:05,521.521 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017643945291638374
2022-03-16 10:28:05,521.521 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 10:28:05,521.521 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'close', '-', 'up', 'of', 'a', 'mail', '##sl', '##ot', 'with', 'a', 'pair', '[MASK]', 'scissors', 'for', 'the', 'handle', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 10:28:05,537.537 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['scissors', 'handle', 'door', 'wall', 'blade', 'pair', 'screw', '[UNK]', 'bolt', 'frame', 'hole', 'lock', 'box', 'metal', 'wood', 'piece', 'object', 'wooden', 'building', 'latch', 'old', 'window', 'bracket', 'number', 'top', 'knob', 'sign', 'drawer', 'light', 'table', 'circle', 'hook', 'hand', 'small', 'strap', 'tool', 'large', 'panel', 'buckle', 'close', 'clock', 'mirror', 'picture', 'letter', 'cross', 'board', 'plate', 'design', 'slot', 'ground']
2022-03-16 10:28:21,582.582 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'door', 'wall', 'wood', 'pair', 'handle', 'plate', 'screw', 'scissors']
2022-03-16 10:30:45,050.050 2829:trainer.py:487 do_train_dict(): eta: 1 day, 0:44:42 iter: 12500 speed: 306.3 images/sec total_norm: 129.3221 (135.3853) loss: 156.5036 (157.8376) masked_loss: 1.7588 (1.8073) tag_loss: 154.8768 (156.0303) time: 1.4334 (1.6719) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4285 (1.6670) save_time: 8.8805 (41.1344) lr: 0.000081 max mem: 26307
2022-03-16 10:30:45,411.411 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361
2022-03-16 10:30:45,411.411 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 146.59719848632812
2022-03-16 10:30:45,411.411 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.58966785007053
2022-03-16 10:30:52,686.686 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017681479454040527
2022-03-16 10:30:52,687.687 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 10:30:52,687.687 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'an', 'extremely', 'busy', 'street', 'with', 'cars', '[MASK]', 'people', 'on', 'the', 'sidewalk', 'holding', 'umbrella', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 10:30:52,702.702 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['umbrella', 'building', 'street', 'person', 'sky', 'car', 'sign', 'road', '[UNK]', 'pole', 'sidewalk', 'man', 'light', 'bag', 'city', 'jacket', 'line', 'window', 'windshield', 'jean', 'woman', 'shoe', 'coat', 'license', 'rain', 'busy', 'plate', 'store', 'tire', 'hand', 'rainy', 'purse', 'traffic', 'backpack', 'truck', 'billboard', 'curb', 'puddle', 'bus', 'fire', 'taxi', 'hood', 'banner', 'boot', 'flag', 'wall', 'shirt', 'mirror', 'group', 'ground']
2022-03-16 10:31:08,744.744 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'city', 'man', 'hand', 'line', 'building', 'road', 'street', 'light', 'car', 'person', 'window', 'sign', 'sky', 'jean', 'traffic', 'bag', 'busy', 'pole', 'jacket', 'sidewalk', 'tire', 'umbrella', 'backpack', 'windshield']
2022-03-16 10:33:32,383.383 2829:trainer.py:487 do_train_dict(): eta: 1 day, 0:42:06 iter: 12600 speed: 306.0 images/sec total_norm: 130.8993 (136.5166) loss: 154.5221 (154.2137) masked_loss: 1.8932 (1.8495) tag_loss: 152.8065 (152.3642) time: 1.4343 (1.6733) data: 0.0002 (0.0002) to_device: 0.0049 (0.0046) time_gpu: 1.4293 (1.6684) save_time: 8.8805 (41.1344) lr: 0.000081 max mem: 26307
2022-03-16 10:33:32,743.743 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.53125
2022-03-16 10:33:32,743.743 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 151.60655212402344
2022-03-16 10:33:32,744.744 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.58588349349856
2022-03-16 10:33:40,084.084 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01774078607559204
2022-03-16 10:33:40,085.085 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 10:33:40,085.085 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'close', 'up', '[MASK]', 'of', 'a', 'hand', 'on', 'a', 'keyboard', 'by', 'a', 'monitor', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 10:33:40,100.100 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['light', 'keyboard', 'screen', 'person', 'hand', 'laptop', 'arm', 'computer', 'button', 'key', 'man', 'leg', 'finger', '[UNK]', 'monitor', 'icon', 'table', 'desk', 'logo', 'stripe', 'ball', 'carrot', 'mouse', 'shirt', 'cord', 'pad', 'dark', 'foot', 'woman', 'remote', 'line', 'lap', 'next', 'head', 'thumb', 'television', 'image', 'paper', 'floor', 'background', 'sleeve', 'someone', 'white', 'front', 'close', 'open', 'wire', 'orange', 'picture', 'wall']
2022-03-16 10:33:56,126.126 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['hand', 'light', 'close', 'person', 'arm', 'view', 'computer', 'screen', 'leg', 'monitor', 'keyboard', 'sleeve']
2022-03-16 10:36:19,789.789 2829:trainer.py:487 do_train_dict(): eta: 1 day, 0:39:30 iter: 12700 speed: 305.8 images/sec total_norm: 129.5555 (133.1202) loss: 153.0539 (154.7452) masked_loss: 1.7588 (1.7756) tag_loss: 151.6066 (152.9696) time: 1.4344 (1.6741) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4291 (1.6690) save_time: 8.8805 (41.1344) lr: 0.000081 max mem: 26307
2022-03-16 10:36:20,150.150 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.59375
2022-03-16 10:36:20,150.150 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 133.58700561523438
2022-03-16 10:36:20,150.150 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.60678535699844
2022-03-16 10:36:27,501.501 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017798686400055885
2022-03-16 10:36:27,501.501 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 10:36:27,501.501 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'brown', 'dog', 'drinking', 'water', 'medicare', 'bowl', 'next', 'to', 'a', 'mirror', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 10:36:27,517.517 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['dog', 'bowl', 'collar', 'leg', 'mirror', 'floor', 'paw', 'head', 'reflection', 'ear', 'carpet', 'mat', 'glass', '[UNK]', 'tray', 'water', 'door', 'rug', 'table', 'plate', 'tail', 'shadow', 'dish', 'handle', 'pole', 'neck', 'wall', 'frame', 'nose', 'chair', 'food', 'small', 'post', 'cup', 'eye', 'base', 'foot', 'brown', 'bag', 'railing', 'container', 'pillow', 'stand', 'step', 'metal', 'towel', 'paper', 'tile', 'cabinet', 'next']
2022-03-16 10:36:43,464.464 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'next', 'water', 'post', 'floor', 'brown', 'glass', 'eye', 'neck', 'dog', 'leg', 'bag', 'ear', 'bowl', 'frame', 'mirror', 'pole', 'collar', 'reflection', 'carpet', 'tray', 'mat', 'paw']
2022-03-16 10:39:07,069.069 2829:trainer.py:487 do_train_dict(): eta: 1 day, 0:36:53 iter: 12800 speed: 306.1 images/sec total_norm: 129.3338 (131.7621) loss: 155.1046 (156.8740) masked_loss: 1.7677 (1.7869) tag_loss: 152.9436 (155.0871) time: 1.4334 (1.6727) data: 0.0001 (0.0001) to_device: 0.0052 (0.0051) time_gpu: 1.4282 (1.6674) save_time: 8.8805 (41.1344) lr: 0.000081 max mem: 26307
2022-03-16 10:39:07,430.430 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5142857432365417
2022-03-16 10:39:07,430.430 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 140.60511779785156
2022-03-16 10:39:07,430.430 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.62342787158582
2022-03-16 10:39:14,816.816 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017771968618035316
2022-03-16 10:39:14,816.816 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 10:39:14,816.816 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', '[MASK]', '[MASK]', 'with', 'a', 'single', 'light', 'and', 'two', 'street', 'signs', 'next', 'to', 'a', '5', 'eleven', 'sign', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 10:39:14,832.832 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'building', 'light', 'pole', 'street', 'window', 'traffic', 'sign', 'tree', '[UNK]', 'road', 'sidewalk', 'line', 'car', 'city', 'roof', 'wall', 'cloud', 'person', 'lamp', 'clock', 'door', 'letter', 'stop', 'man', 'post', 'intersection', 'tower', 'balcony', 'tall', 'top', 'arrow', 'flag', 'large', 'wire', 'green', 'signal', 'tire', 'tail', 'truck', 'red', 'bus', 'wing', 'railing', 'wheel', 'side', 'antenna', 'van', 'blue', 'hand']
2022-03-16 10:39:30,744.744 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'single', 'building', 'street', 'light', 'fire', 'window', 'sign', 'sky', 'electric', 'escape', 'eleven', 'pole', 'arrow', 'balcony', 'shutter']
2022-03-16 10:41:54,358.358 2829:trainer.py:487 do_train_dict(): eta: 1 day, 0:34:16 iter: 12900 speed: 306.1 images/sec total_norm: 128.6382 (133.1900) loss: 159.9225 (159.7316) masked_loss: 1.7283 (1.7726) tag_loss: 158.1663 (157.9591) time: 1.4334 (1.6730) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4282 (1.6679) save_time: 8.8805 (41.1344) lr: 0.000081 max mem: 26307
2022-03-16 10:41:54,718.718 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6969696879386902
2022-03-16 10:41:54,718.718 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 180.11219787597656
2022-03-16 10:41:54,718.718 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.60770240196815 2022-03-16 10:42:02,240.240 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017777008935809135 2022-03-16 10:42:02,240.240 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 10:42:02,240.240 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'car', 'is', 'stopped', 'for', '[MASK]', 'red', 'light', 'at', 'an', 'intersection', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 10:42:02,256.256 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['car', 'sky', 'light', 'tree', 'street', 'sign', 'pole', 'person', 'plate', 'road', 'license', 'traffic', 'window', 'windshield', 'man', '[UNK]', 'building', 'van', 'number', 'sidewalk', 'mirror', 'back', 'shirt', 'reflection', 'bus', 'line', 'tail', 'logo', 'roof', 'tire', 'vehicle', 'curb', 'top', 'shadow', 'motorcycle', 'woman', 'wire', 'hood', 'arrow', 'head', 'truck', 'busy', 'next', 'bush', 'fence', 'trunk', 'intersection', 'bumper', 'city', 'group'] 2022-03-16 10:42:18,199.199 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['man', 'road', 'street', 'red', 'light', 'car', 'person', 'window', 'tree', 'sign', 'sky', 'shirt', 'bus', 'truck', 'plate', 'license', 'pole', 'intersection', 'logo', 'reflection', 'taxi', 'sidewalk', 'curb'] 03-16 10:42:32.140 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 10:42:32.140 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 10:42:33.152 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-16 10:44:41,841.841 2829:trainer.py:487 do_train_dict(): eta: 1 day, 0:31:40 iter: 13000 speed: 305.7 images/sec total_norm: 129.0177 (132.9092) loss: 156.4691 (157.1931) masked_loss: 1.6816 (1.7890) tag_loss: 154.4173 (155.4041) time: 1.4354 (1.6748) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4303 (1.6697) save_time: 8.8805 (41.1344) lr: 0.000080 max mem: 26307 2022-03-16 10:44:42,200.200 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6111111044883728 2022-03-16 10:44:42,200.200 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 185.89173889160156 2022-03-16 10:44:42,200.200 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.55554513712876 2022-03-16 10:44:49,717.717 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017778154462575912 2022-03-16 10:44:49,717.717 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 10:44:49,718.718 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'woman', 'is', 'getting', 'ready', 'to', '[MASK]', 'a', 'tennis', 'ball', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 10:44:49,733.733 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'tennis', 'wall', 'shoe', 'court', 'letter', 'hand', 'shadow', 'woman', 'leg', 'ball', 'shirt', 'man', 'hat', 'head', 'dress', 'outfit', 'cap', 'arm', 'line', 'ground', 'person', 'microphone', 'player', 'stand', 'camera', 'skirt', 'logo', 'hair', 'sunglasses', 'chair', 'top', 'sock', 'watch', 'writing', 'short', 'girl', 'spectator', 'sign', 'pole', 'net', 'word', 'uniform', 'seat', 'band', 'face', 'female', 'jacket', 'handle', 'tank'] 2022-03-16 10:45:05,846.846 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'line', 'woman', 'court', 'wall', 'ready', 'ball', 'letter', 'shirt', 'leg', 'dress', 'tennis', 'shadow', 'hat', 'cap', 'logo', 'shoe', 'outfit'] 2022-03-16 10:47:29,188.188 2829:trainer.py:487 do_train_dict(): eta: 1 day, 0:29:03 iter: 13100 speed: 306.0 images/sec total_norm: 129.6933 (131.9507) loss: 152.1934 (156.0730) masked_loss: 1.7662 (1.8039) tag_loss: 150.0671 (154.2691) time: 1.4323 (1.6735) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4273 (1.6684) save_time: 8.8805 (41.1344) lr: 0.000080 max mem: 26307 2022-03-16 10:47:29,549.549 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663 2022-03-16 10:47:29,549.549 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 169.70765686035156 2022-03-16 10:47:29,549.549 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.54891875295928 2022-03-16 10:47:37,115.115 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01780332811176777 2022-03-16 10:47:37,115.115 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 10:47:37,115.115 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'of', 'elephants', '[MASK]', 'on', 'top', 'of', 'a', 'grass', 'covered', 'field', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 10:47:37,130.130 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['elephant', 'tree', 'grass', 'leg', 'trunk', 'ear', '[UNK]', 'ground', 'shadow', 'head', 'eye', 'sky', 'bush', 'field', 'tail', 'foot', 'dirt', 'rock', 'branch', 'hill', 'path', 'green', 'grassy', 'standing', 'face', 'area', 'large', 'post', 'mountain', 'fence', 'water', 'walking', 'leaf', 'bird', 'next', 'stick', 'small', 'pole', 'forest', 'top', 'baby', 'herd', 'mouth', 'background', 'man', 'lush', 'building', 'group', 'animal', 'wall'] 2022-03-16 10:47:53,141.141 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'top', 'field', 'ground', 'rock', 'couple', 'eye', 'tree', 'sky', 'leg', 'ear', 'grass', 'bush', 'dirt', 'trunk', 'elephant'] 2022-03-16 10:50:16,821.821 2829:trainer.py:487 do_train_dict(): eta: 1 day, 0:26:27 iter: 13200 speed: 305.4 images/sec total_norm: 131.2391 (133.6553) loss: 154.5149 (156.5798) masked_loss: 1.7826 (1.8028) tag_loss: 152.3601 (154.7770) time: 1.4341 (1.6764) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4290 (1.6712) save_time: 8.8805 (41.1344) lr: 0.000080 max mem: 26307 2022-03-16 10:50:17,182.182 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5135135054588318 2022-03-16 10:50:17,183.183 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 144.74815368652344 2022-03-16 10:50:17,183.183 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.56220824736401 2022-03-16 10:50:24,759.759 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017842743545770645 2022-03-16 10:50:24,759.759 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 10:50:24,759.759 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'with', 'a', 'tub', ',', 'windows', ',', 'a', 'sink', 'and', 'a', 'spray', 'bottle', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 10:50:24,774.774 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', '[UNK]', 'tile', 'window', 'bathroom', 'tub', 'sink', 'bottle', 'cabinet', 'floor', 'mirror', 'knob', 'door', 'handle', 'drawer', 'white', 'soap', 'vanity', 'bath', 'drain', 'curtain', 'room', 'toilet', 'tank', 'outlet', 'label', 'light', 'frame', 'towel', 'dish', 'rack', 'shelf', 'small', 'shower', 'ledge', 'kitchen', 'counter', 'top', 'pump', 'lid', 'next', 'reflection', 'board', 'pipe', 'blind', 'glass', 'rug', 'tiled', 'cup', 'seat'] 2022-03-16 10:50:40,748.748 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'door', 'floor', 'wall', 'window', 'handle', 'cabinet', 'mirror', 'bathroom', 'bottle', 'sink', 'spray', 'drawer', 'tile', 'tub', 'knob', 'vanity'] 2022-03-16 10:53:04,337.337 2829:trainer.py:487 do_train_dict(): eta: 1 day, 0:23:50 iter: 13300 speed: 305.6 images/sec total_norm: 131.4506 (133.4343) loss: 155.2207 (157.3795) masked_loss: 1.7986 (1.7909) tag_loss: 153.4039 (155.5886) time: 1.4334 (1.6751) data: 0.0001 (0.0005) to_device: 0.0050 (0.0048) time_gpu: 1.4284 (1.6698) save_time: 8.8805 (41.1344) lr: 0.000080 max mem: 26307 2022-03-16 10:53:04,698.698 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5625 2022-03-16 10:53:04,698.698 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 159.47372436523438 2022-03-16 10:53:04,699.699 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.51962593420228 2022-03-16 10:53:12,355.355 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017834201455116272 2022-03-16 10:53:12,356.356 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 10:53:12,356.356 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'plate', 'of', 'food', 'including', 'meat', '[MASK]', 've', '##gg', '##ies', ',', 'and', 'grains', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 10:53:12,371.371 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['plate', 'food', '[UNK]', 'meat', 'table', 'mushroom', 'carrot', 'potato', 'bread', 'ham', 'beef', 'cream', 'sauce', 'sausage', 'fork', 'egg', 'tomato', 'cheese', 'knife', 'stem', 'vegetable', 'glass', 'steak', 'handle', 'pepper', 'napkin', 'cup', 'spoon', 'butter', 'fruit', 'breakfast', 'onion', 'crust', 'white', 'banana', 'garlic', 'ice', 'meal', 'bacon', 'hole', 'bowl', 'sandwich', 'bean', 'slice', 'container', 'close', 'top', 'different', 'cloth', 'chicken'] 2022-03-16 10:53:28,351.351 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'water', 'table', 'food', 'glass', 'plate', 'meat', 'bread', 'egg', 'ham', 'sandwich', 'beef', 'sauce', 'mushroom', 'crust', 'shrimp', 'onion', 'carrot'] 2022-03-16 10:55:51,780.780 2829:trainer.py:487 do_train_dict(): eta: 1 day, 0:21:13 iter: 13400 speed: 305.8 images/sec total_norm: 129.1583 (133.6286) loss: 156.1652 (155.9487) masked_loss: 1.8205 (1.8168) tag_loss: 154.6690 (154.1320) time: 1.4326 (1.6744) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4274 (1.6693) save_time: 8.8805 (41.1344) lr: 0.000080 max mem: 26307 2022-03-16 10:55:52,142.142 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7027027010917664 2022-03-16 10:55:52,142.142 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 157.70547485351562 2022-03-16 10:55:52,143.143 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.51687588161893 2022-03-16 10:55:59,897.897 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01786199025809765 2022-03-16 10:55:59,898.898 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 10:55:59,898.898 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'train', 'coming', 'down', 'the', 'tracks', 'towards', 'a', 'depot', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 10:55:59,914.914 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['track', 'sky', 'pole', 'platform', 'tree', 'train', 'light', 'station', 'sign', '[UNK]', 'bench', 'line', 'window', 'shelter', 'wire', 'sidewalk', 'roof', 'door', 'stop', 'fence', 'ground', 'building', 'street', 'front', 'cloud', 'traffic', 'man', 'structure', 'post', 'gravel', 'windshield', 'car', 'number', 'gate', 'grass', 'bush', 'person', 'can', 'shirt', 'power', 'railroad', 'wall', 'blue', 'next', 'group', 'box', 'bus', 'trash', 'lamp', 'background'] 2022-03-16 10:56:15,873.873 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'group', 'line', 'station', 'building', 'door', 'light', 'woman', 'hair', 'track', 'person', 'child', 'boy', 'structure', 'train', 'tree', 'sign', 'sky', 'shirt', 'platform', 'bag', 'clock', 'tunnel', 'hat', 'pole', 'jacket', 'bench', 'shelter', 'depot', 'shoe', 'sidewalk'] 2022-03-16 10:58:39,487.487 2829:trainer.py:487 do_train_dict(): eta: 1 day, 0:18:37 iter: 13500 speed: 305.3 images/sec total_norm: 129.2224 (132.2318) loss: 157.3427 (159.0163) masked_loss: 1.8069 (1.7993) tag_loss: 155.2148 (157.2169) time: 1.4341 (1.6771) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4288 (1.6718) save_time: 8.8805 (41.1344) lr: 0.000080 max mem: 26307 2022-03-16 10:58:39,848.848 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4117647111415863 2022-03-16 10:58:39,848.848 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 143.60903930664062 2022-03-16 10:58:39,848.848 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.50097319659065 2022-03-16 10:58:47,541.541 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017866848036646843 2022-03-16 10:58:47,541.541 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 10:58:47,542.542 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'this', 'is', '[MASK]', 'del', '##i', 'counter', 'with', '[MASK]', 'variety', 'of', 'meat', 'with', 'different', '[MASK]', '##s', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 10:58:47,557.557 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sandwich', '[UNK]', 'tray', 'food', 'bread', 'meat', 'cheese', 'container', 'plate', 'display', 'egg', 'cake', 'tomato', 'bowl', 'table', 'glass', 'vegetable', 'case', 'spoon', 'onion', 'fish', 'window', 'wall', 'sign', 'light', 'different', 'paper', 'cup', 'carrot', 'picture', 'ham', 'cookie', 'bunch', 'person', 'pan', 'napkin', 'leaf', 'pastry', 'knife', 'handle', 'potato', 'shelf', 'dish', 'hamburger', 'reflection', 'pizza', 'fork', 'salad', 'large', 'label'] 2022-03-16 10:59:03,527.527 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'different', 'table', 'wall', 'variety', 'counter', 'meat', 'bread', 'egg', 'cheese', 'sandwich', 'tray', 'tile', 'lemon', 'vegetable', 'tomato'] 2022-03-16 11:01:27,182.182 2829:trainer.py:487 do_train_dict(): eta: 1 day, 0:16:00 iter: 13600 speed: 305.3 images/sec total_norm: 129.0708 (132.6013) loss: 154.2439 (156.0019) masked_loss: 1.8078 (1.8053) tag_loss: 152.4674 (154.1966) time: 1.4335 (1.6769) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4282 (1.6717) save_time: 8.8805 (41.1344) lr: 0.000080 max mem: 26307 2022-03-16 11:01:27,543.543 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091 2022-03-16 11:01:27,543.543 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 171.90118408203125 2022-03-16 11:01:27,543.543 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
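Each forward() report pairs a 50-tag "Sample Generation" list against the "GT Tags" of one sample, next to corpus-level "Tag Precision" (hovering around 69.5-69.7 in this stretch) and "Tag mAP". The exact metric definitions live in tagger_caption_uni_pipeline_expanding.py and are not visible in the log; the set-based precision@k below is only an illustrative simplification, using truncated tag lists from the first sample in this section.

def tag_precision_at_k(predicted, gt, k=20):
    # Fraction of the top-k predicted tags that appear in the ground
    # truth, in percent. Illustrative only: the pipeline's real "Tag
    # Precision" and "Tag mAP" definitions are not shown in this log.
    topk = predicted[:k]
    return 100.0 * len(set(topk) & set(gt)) / len(topk)

pred = ['dog', 'bowl', 'collar', 'leg', 'mirror', 'floor', 'paw', 'head',
        'reflection', 'ear', 'carpet', 'mat', 'glass', '[UNK]', 'tray',
        'water', 'door', 'rug', 'table', 'plate']
gt = ['head', 'water', 'floor', 'glass', 'dog', 'leg', 'ear', 'bowl',
      'mirror', 'collar', 'reflection', 'carpet', 'tray', 'mat', 'paw']
print(tag_precision_at_k(pred, gt))   # 75.0 for these truncated lists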
= 69.51166350650091 2022-03-16 11:01:35,327.327 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017877161502838135 2022-03-16 11:01:35,327.327 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 11:01:35,327.327 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'this', '[MASK]', 'a', 'large', 'group', 'of', 'people', 'standing', 'ar', '##oun', '##g', 'a', 'display', 'table', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 11:01:35,343.343 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['person', 'man', 'hair', 'tree', 'woman', 'jacket', 'building', 'glasses', 'tent', '[UNK]', 'crowd', 'head', 'shirt', 'box', 'window', 'hat', 'scarf', 'umbrella', 'sun', 'phone', 'hand', 'girl', 'sky', 'light', 'sweater', 'bag', 'group', 'cell', 'canopy', 'pole', 'purse', 'sunglasses', 'sign', 'face', 'coat', 'backpack', 'shoe', 'roof', 'paper', 'ceiling', 'book', 'boy', 'cup', 'table', 'stand', 'suit', 'cap', 'food', 'ground', 'camera'] 2022-03-16 11:01:51,257.257 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'can', 'man', 'hand', 'building', 'light', 'woman', 'ground', 'hair', 'girl', 'person', 'floor', 'table', 'seat', 'phone', 'glass', 'chair', 'tree', 'shirt', 'leg', 'plate', 'bottle', 'ceiling', 'hat', 'jacket', 'tape', 'glasses', 'purse', 'keyboard', 'tent', 'laptop', 'sweater'] 2022-03-16 11:04:14,729.729 2829:trainer.py:487 do_train_dict(): eta: 1 day, 0:13:23 iter: 13700 speed: 305.6 images/sec total_norm: 131.6675 (135.2967) loss: 154.9007 (155.5972) masked_loss: 1.8177 (1.8144) tag_loss: 152.9210 (153.7828) time: 1.4328 (1.6755) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4278 (1.6704) save_time: 8.8805 (41.1344) lr: 0.000079 max mem: 26307 2022-03-16 11:04:15,090.090 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5625 2022-03-16 11:04:15,090.090 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 131.4833984375 2022-03-16 11:04:15,091.091 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.56272937940514 2022-03-16 11:04:22,889.889 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0178784541785717 2022-03-16 11:04:22,889.889 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 11:04:22,889.889 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'zebra', 'grazing', 'alone', 'on', '[MASK]', '[MASK]', 'with', 'grass', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 11:04:22,905.905 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['zebra', 'leg', 'grass', 'mane', 'head', 'ear', 'eye', 'stripe', 'shadow', 'neck', 'ground', 'road', 'nose', '[UNK]', 'mouth', 'tail', 'body', 'bush', 'field', 'tree', 'face', 'path', 'gravel', 'back', 'hair', 'next', 'green', 'standing', 'side', 'snout', 'dirt', 'grassy', 'rock', 'grazing', 'leaf', 'other', 'spot', 'front', 'branch', 'patch', 'couple', 'fence', 'lone', 'tall', 'lush', 'sun', 'animal', 'flower', 'area', 'camera'] 2022-03-16 11:04:38,806.806 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'road', 'field', 'mouth', 'eye', 'neck', 'path', 'leg', 'nose', 'ear', 'shadow', 'grass', 'tail', 'stripe', 'mane', 'zebra'] 2022-03-16 11:07:02,324.324 2829:trainer.py:487 do_train_dict(): eta: 1 day, 0:10:46 iter: 13800 speed: 305.5 images/sec total_norm: 128.6839 (130.4294) loss: 152.7523 (154.7021) masked_loss: 1.8484 (1.9045) tag_loss: 151.0355 (152.7975) time: 1.4326 (1.6759) data: 0.0002 (0.0002) to_device: 0.0049 (0.0048) time_gpu: 1.4277 (1.6709) save_time: 8.8805 (41.1344) lr: 0.000079 max mem: 26307 2022-03-16 11:07:02,685.685 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091 2022-03-16 11:07:02,686.686 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 134.0855712890625 2022-03-16 11:07:02,686.686 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.58463380662657 2022-03-16 11:07:10,540.540 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017896931618452072 2022-03-16 11:07:10,540.540 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 11:07:10,541.541 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'this', 'is', 'a', 'picture', 'of', 'a', '[MASK]', 'sitting', 'on', '[MASK]', 'chair', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 11:07:10,556.556 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['cat', 'ear', 'flower', 'ground', 'wall', 'leg', 'plant', 'paw', 'nose', 'head', 'bench', 'wood', 'chair', 'tree', '[UNK]', 'building', 'seat', 'pot', 'leaf', 'bush', 'wooden', 'tail', 'sidewalk', 'window', 'back', 'eye', 'rope', 'branch', 'log', 'stick', 'white', 'grass', 'trunk', 'rock', 'pole', 'next', 'post', 'board', 'floor', 'weed', 'brick', 'face', 'door', 'garden', 'dirt', 'sculpture', 'gray', 'small', 'sign', 'statue'] 2022-03-16 11:07:26,496.496 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'back', 'head', 'ground', 'board', 'table', 'wall', 'seat', 'writing', 'stand', 'chair', 'plant', 'tree', 'wood', 'branch', 'sign', 'picture', 'leg', 'nose', 'ear', 'chain', 'cat', 'net', 'stick', 'flower', 'leaf', 'stem', 'hook', 'poster', 'paw'] 2022-03-16 11:09:50,126.126 2829:trainer.py:487 do_train_dict(): eta: 1 day, 0:08:10 iter: 13900 speed: 305.1 images/sec total_norm: 129.1128 (132.1185) loss: 154.0491 (156.0853) masked_loss: 1.8072 (1.8160) tag_loss: 152.3961 (154.2693) time: 1.4338 (1.6780) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4286 (1.6731) save_time: 8.8805 (41.1344) lr: 0.000079 max mem: 26307 2022-03-16 11:09:50,485.485 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6285714507102966 2022-03-16 11:09:50,485.485 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 139.5953826904297 2022-03-16 11:09:50,486.486 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.62439596993583 2022-03-16 11:09:58,342.342 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017922593280673027 2022-03-16 11:09:58,343.343 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 11:09:58,343.343 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'small', 'dog', 'attempts', 'to', 'grab', 'a', '[MASK]', 'fr', '##is', '[MASK]', 'with', 'it', "'", 's', 'mouth', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 11:09:58,359.359 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['person', 'dog', 'grass', 'leg', 'head', 'collar', 'cone', 'mouth', 'ear', 'shadow', '[UNK]', 'foot', 'ground', 'tail', 'shirt', 'eye', 'paw', 'field', 'nose', 'body', 'neck', 'hat', 'background', 'crowd', 'face', 'tree', 'man', 'park', 'object', 'flag', 'playing', 'arm', 'toy', 'tag', 'white', 'hand', 'flower', 'back', 'green', 'small', 'blue', 'air', 'spot', 'brown', 'top', 'spectator', 'woman', 'black', 'bell', 'fur'] 2022-03-16 11:10:14,238.238 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'small', 'ground', 'blue', 'mouth', 'person', 'eye', 'foot', 'shirt', 'dog', 'leg', 'ear', 'shadow', 'grass', 'hat', 'tag', 'collar', 'cone', 'paw'] 03-16 11:12:33.157 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 11:12:33.158 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 11:12:34.165 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-16 11:12:37,917.917 2829:trainer.py:487 do_train_dict(): eta: 1 day, 0:05:33 iter: 14000 speed: 305.1 images/sec total_norm: 132.0177 (134.6876) loss: 154.1951 (153.9942) masked_loss: 1.7925 (1.8280) tag_loss: 152.9146 (152.1662) time: 1.4323 (1.6779) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4273 (1.6729) save_time: 8.8805 (41.1344) lr: 0.000079 max mem: 26307 2022-03-16 11:12:38,278.278 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091 2022-03-16 11:12:38,278.278 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 125.2453384399414 2022-03-16 11:12:38,278.278 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
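The aml_server.py monitor() records interleaved with the training output poll all eight V100s and report one dict per GPU; near-constant 99-100% gpu_util with about 29,000 of 32,510 MiB used is roughly consistent with the trainer's own "max mem: 26307" (MiB) plus CUDA context overhead. aml_server.py's implementation is not shown in this log; the sketch below produces records of the same shape via nvidia-smi's machine-readable query interface rather than by parsing the human-readable table.

import subprocess

def query_gpus():
    # Returns e.g. [{'mem_used': 29000, 'mem_total': 32510,
    #                'gpu_util': 99}, ...], one dict per GPU, mirroring
    # the monitor() records above. This is an assumed reimplementation,
    # not aml_server.py's actual code.
    out = subprocess.check_output(
        ["nvidia-smi",
         "--query-gpu=memory.used,memory.total,utilization.gpu",
         "--format=csv,noheader,nounits"],
        text=True)
    gpus = []
    for line in out.strip().splitlines():
        used, total, util = (int(v) for v in line.split(", "))
        gpus.append({"mem_used": used, "mem_total": total,
                     "gpu_util": util})
    return gpus

print(query_gpus())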
= 69.6564180631164 2022-03-16 11:12:46,298.298 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01791957952082157 2022-03-16 11:12:46,299.299 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 11:12:46,299.299 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'car', 'is', 'parked', '[MASK]', '[MASK]', 'a', 'curb', 'with', 'its', 'brake', 'lights', 'on', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 11:12:46,314.314 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'light', 'grass', 'road', 'car', 'street', 'tree', 'sidewalk', 'pole', 'building', 'line', 'traffic', 'person', 'wall', 'fence', 'house', 'wire', '[UNK]', 'sign', 'tire', 'roof', 'van', 'window', 'brick', 'fire', 'license', 'cloud', 'curb', 'pillar', 'lawn', 'plate', 'city', 'bush', 'man', 'arrow', 'windshield', 'chimney', 'back', 'intersection', 'tail', 'truck', 'stop', 'bag', 'green', 'post', 'town', 'box', 'suv', 'shirt', 'column'] 2022-03-16 11:13:02,302.302 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'house', 'line', 'next', 'building', 'road', 'street', 'light', 'car', 'person', 'wall', 'tree', 'sky', 'traffic', 'brick', 'grass', 'column', 'cloud', 'pole', 'wire', 'sidewalk', 'brake', 'curb', 'pillar', 'bumper'] 2022-03-16 11:15:25,879.879 2829:trainer.py:487 do_train_dict(): eta: 1 day, 0:02:57 iter: 14100 speed: 304.8 images/sec total_norm: 130.9817 (132.8071) loss: 154.9062 (155.3979) masked_loss: 1.7900 (1.8193) tag_loss: 152.6038 (153.5786) time: 1.4333 (1.6796) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4281 (1.6744) save_time: 8.8805 (41.1344) lr: 0.000079 max mem: 26307 2022-03-16 11:15:26,240.240 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6875 2022-03-16 11:15:26,241.241 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 171.593505859375 2022-03-16 11:15:26,241.241 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.66850146441392 2022-03-16 11:15:34,201.201 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017911667004227638 2022-03-16 11:15:34,201.201 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 11:15:34,201.201 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'elephants', 'with', '[MASK]', 'long', 'tu', '##sk', '##s', 'standing', 'at', 'outing', 'wall', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 11:15:34,216.216 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['elephant', 'trunk', 'tree', 'grass', 'eye', 'ear', 'leg', 'head', 'sky', '[UNK]', 'man', 'person', 'ground', 'rock', 'shirt', 'jacket', 'wall', 'structure', 'foot', 'short', 'bush', 'roof', 'fence', 'mouth', 'zoo', 'water', 'dirt', 'box', 'building', 'path', 'tank', 'large', 'hat', 'barrel', 'leaf', 'hair', 'couple', 'can', 'field', 'post', 'log', 'stick', 'stump', 'woman', 'chain', 'container', 'front', 'top', 'next', 'tail'] 2022-03-16 11:15:50,198.198 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'long', 'man', 'ground', 'rock', 'person', 'wall', 'eye', 'foot', 'tree', 'sky', 'shirt', 'leg', 'ear', 'grass', 'dirt', 'trunk', 'elephant', 'container'] 2022-03-16 11:18:13,681.681 2829:trainer.py:487 do_train_dict(): eta: 1 day, 0:00:20 iter: 14200 speed: 305.1 images/sec total_norm: 132.7275 (135.7271) loss: 154.6880 (155.5748) masked_loss: 1.7193 (1.7565) tag_loss: 152.7029 (153.8183) time: 1.4329 (1.6780) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4277 (1.6728) save_time: 8.8805 (41.1344) lr: 0.000079 max mem: 26307 2022-03-16 11:18:14,044.044 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.625 2022-03-16 11:18:14,045.045 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 148.77853393554688 2022-03-16 11:18:14,045.045 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.67060537271567 2022-03-16 11:18:22,091.091 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017940057441592216 2022-03-16 11:18:22,091.091 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 11:18:22,092.092 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'bed', 'has', 'a', 'wooden', 'ɒ', 'and', 'a', 'brown', 'blanket', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 11:18:22,107.107 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['bed', 'floor', 'sheet', 'blanket', 'towel', 'wall', 'mattress', '[UNK]', 'room', 'carpet', 'chair', 'pillow', 'paper', 'frame', 'window', 'leg', 'shadow', 'table', 'bar', 'curtain', 'drawer', 'door', 'bag', 'railing', 'rack', 'person', 'bedroom', 'rail', 'post', 'bunk', 'light', 'wooden', 'cushion', 'clothes', 'dresser', 'backpack', 'cabinet', 'shade', 'suitcase', 'top', 'ladder', 'handle', 'bench', 'pole', 'large', 'wood', 'book', 'lamp', 'furniture', 'outlet'] 2022-03-16 11:18:38,067.067 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['room', 'door', 'person', 'floor', 'bed', 'wall', 'brown', 'paper', 'bar', 'leg', 'wooden', 'frame', 'handle', 'shadow', 'doorway', 'sheet', 'furniture', 'blanket', 'pillow', 'carpet', 'towel', 'drawer', 'mattress', 'rack', 'dresser'] 2022-03-16 11:21:01,559.559 2829:trainer.py:487 do_train_dict(): eta: 23:57:43 iter: 14300 speed: 305.0 images/sec total_norm: 130.4659 (135.9736) loss: 153.5232 (156.5875) masked_loss: 1.7260 (1.7753) tag_loss: 151.6988 (154.8122) time: 1.4333 (1.6788) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4282 (1.6737) save_time: 8.8805 (41.1344) lr: 0.000078 max mem: 26307 2022-03-16 11:21:01,925.925 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.34375 2022-03-16 11:21:01,925.925 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 115.0206298828125 2022-03-16 11:21:01,925.925 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.71441623899672 2022-03-16 11:21:10,006.006 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01796768233180046 2022-03-16 11:21:10,007.007 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 11:21:10,007.007 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'surfer', 'walking', 'along', '[MASK]', 'sidewalk', '[MASK]', 'a', 'beach', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 11:21:10,022.022 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['building', 'window', 'pole', 'fence', 'railing', '[UNK]', 'shirt', 'wall', 'man', 'head', 'ground', 'sign', 'person', 'leg', 'tree', 'shoe', 'arm', 'logo', 'car', 'boy', 'roof', 'balcony', 'hand', 'hat', 'sky', 'light', 'hair', 'box', 'helmet', 'jean', 'trash', 'door', 'grass', 'stair', 'short', 'can', 'jacket', 'banner', 'dirt', 'wheel', 'woman', 'tail', 'sand', 'rail', 'line', 'flag', 'step', 'sidewalk', 'bin', 'bench'] 2022-03-16 11:21:25,871.871 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'building', 'cup', 'short', 'ground', 'rock', 'board', 'chair', 'window', 'step', 'box', 'beach', 'shirt', 'leg', 'gate', 'sand', 'hat', 'pole', 'fence', 'balcony', 'sidewalk', 'railing', 'grill', 'surfer'] 2022-03-16 11:23:49,600.600 2829:trainer.py:487 do_train_dict(): eta: 23:55:06 iter: 14400 speed: 304.7 images/sec total_norm: 128.5652 (129.8504) loss: 152.8820 (153.2037) masked_loss: 1.7608 (1.7700) tag_loss: 150.4307 (151.4338) time: 1.4338 (1.6804) data: 0.0001 (0.0005) to_device: 0.0051 (0.0049) time_gpu: 1.4286 (1.6750) save_time: 8.8805 (41.1344) lr: 0.000078 max mem: 26307 2022-03-16 11:23:49,960.960 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4848484992980957 2022-03-16 11:23:49,960.960 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 141.3422393798828 2022-03-16 11:23:49,960.960 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.74001364872373 2022-03-16 11:23:58,086.086 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018045194447040558 2022-03-16 11:23:58,087.087 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 11:23:58,087.087 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'yellow', 'cat', 'laying', 'down', 'with', 'a', 'white', '[MASK]', 'on', 'its', 'head', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 11:23:58,102.102 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['nose', 'eye', 'hat', 'head', 'face', 'blanket', 'wall', '[UNK]', 'bed', 'flower', 'cat', 'book', 'fur', 'towel', 'tail', 'light', 'hair', 'room', 'ceiling', 'table', 'lamp', 'white', 'body', 'pillow', 'frame', 'picture', 'chair', 'mouth', 'chest', 'leg', 'shelf', 'feather', 'animal', 'curtain', 'mirror', 'design', 'window', 'nightstand', 'person', 'top', 'fluffy', 'dog', 'sheet', 'cushion', 'button', 'couch', 'background', 'large', 'stripe', 'star'] 2022-03-16 11:24:14,140.140 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'face', 'body', 'white', 'bed', 'wall', 'eye', 'rail', 'nose', 'frame', 'cat', 'speaker', 'ceiling', 'hat', 'blanket', 'towel', 'stripe'] 2022-03-16 11:26:37,616.616 2829:trainer.py:487 do_train_dict(): eta: 23:52:30 iter: 14500 speed: 304.7 images/sec total_norm: 130.3180 (131.9520) loss: 154.3104 (156.0449) masked_loss: 1.6773 (1.6853) tag_loss: 152.8525 (154.3596) time: 1.4331 (1.6802) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4277 (1.6749) save_time: 8.8805 (41.1344) lr: 0.000078 max mem: 26307 2022-03-16 11:26:37,976.976 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361 2022-03-16 11:26:37,976.976 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 176.95310974121094 2022-03-16 11:26:37,977.977 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.74564669883415 2022-03-16 11:26:46,123.123 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018046459183096886 2022-03-16 11:26:46,124.124 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 11:26:46,124.124 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'parade', 'of', 'policemen', 'on', 'motorcycles', 'are', 'escorting', '[MASK]', 'bus', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 11:26:46,139.139 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['motorcycle', 'tire', 'road', 'light', 'windshield', '[UNK]', 'man', 'street', 'shadow', 'helmet', 'tree', 'person', 'bike', 'window', 'building', 'car', 'sign', 'pole', 'truck', 'mirror', 'line', 'wheel', 'head', 'license', 'shirt', 'plate', 'bus', 'sky', 'horn', 'grass', 'flag', 'fence', 'hat', 'police', 'jacket', 'bush', 'jean', 'officer', 'woman', 'hair', 'parade', 'front', 'traffic', 'bag', 'roof', 'sidewalk', 'van', 'stripe', 'curb', 'ladder'] 2022-03-16 11:27:02,035.035 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'line', 'building', 'road', 'street', 'light', 'car', 'person', 'officer', 'tree', 'sign', 'sky', 'bus', 'vehicle', 'truck', 'plate', 'shadow', 'license', 'pole', 'parade', 'motorcycle', 'helmet', 'tire', 'policeman', 'windshield'] 2022-03-16 11:29:25,702.702 2829:trainer.py:487 do_train_dict(): eta: 23:49:53 iter: 14600 speed: 304.6 images/sec total_norm: 132.5196 (133.7473) loss: 157.6348 (158.9924) masked_loss: 1.8264 (1.7979) tag_loss: 155.2544 (157.1945) time: 1.4330 (1.6808) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4279 (1.6757) save_time: 8.8805 (41.1344) lr: 0.000078 max mem: 26307 2022-03-16 11:29:26,063.063 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6176470518112183 2022-03-16 11:29:26,063.063 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 159.8418731689453 2022-03-16 11:29:26,063.063 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.7567613562759 2022-03-16 11:29:34,259.259 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018040377646684647 2022-03-16 11:29:34,259.259 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 11:29:34,259.259 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', '[MASK]', 'people', 'are', 'gathered', 'together', 'in', 'the', 'living', 'room', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 11:29:34,275.275 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hair', 'glasses', 'woman', 'person', 'curtain', 'shirt', 'hand', 'wall', 'head', 'room', 'hat', 'man', 'cup', 'laptop', 'girl', 'face', 'couch', 'table', 'jean', 'window', 'jacket', 'computer', '[UNK]', 'group', 'television', 'chair', 'blanket', 'boy', 'sweater', 'food', 'screen', 'glass', 'pillow', 'ear', 'keyboard', 'picture', 'bed', 'plate', 'coffee', 'floor', 'bag', 'shoe', 'monitor', 'book', 'lid', 'box', 'cap', 'watch', 'ponytail', 'light'] 2022-03-16 11:29:50,166.166 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'room', 'book', 'woman', 'cup', 'living', 'hair', 'girl', 'person', 'floor', 'table', 'wall', 'glass', 'chair', 'plant', 'watch', 'box', 'jean', 'shirt', 'screen', 'kid', 'speaker', 'hat', 'couch', 'jacket', 'globe', 'glasses', 'pillow', 'curtain', 'shelf', 'laptop', 'scarf'] 2022-03-16 11:32:13,736.736 2829:trainer.py:487 do_train_dict(): eta: 23:47:16 iter: 14700 speed: 304.7 images/sec total_norm: 131.3952 (134.9033) loss: 152.5495 (154.7191) masked_loss: 1.7493 (1.7666) tag_loss: 150.5648 (152.9524) time: 1.4330 (1.6803) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4280 (1.6752) save_time: 8.8805 (41.1344) lr: 0.000078 max mem: 26307 2022-03-16 11:32:14,098.098 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7058823704719543 2022-03-16 11:32:14,098.098 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 193.5040283203125 2022-03-16 11:32:14,099.099 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.73174600343447 2022-03-16 11:32:22,380.380 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01804724894464016 2022-03-16 11:32:22,380.380 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 11:32:22,380.380 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'room', 'is', 'in', 'the', 'process', '[MASK]', 'being', 'renovated', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 11:32:22,396.396 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'floor', 'book', 'table', 'outlet', 'room', 'cord', '[UNK]', 'television', 'shelf', 'chair', 'couch', 'bag', 'light', 'living', 'ground', 'coffee', 'speaker', 'plug', 'window', 'box', 'remote', 'switch', 'bottle', 'stand', 'wire', 'cup', 'screen', 'sofa', 'lamp', 'magazine', 'frame', 'wii', 'leg', 'paper', 'tv', 'controller', 'laptop', 'can', 'top', 'door', 'ceiling', 'candle', 'control', 'desk', 'toy', 'pillow', 'dvd', 'phone', 'fire'] 2022-03-16 11:32:38,334.334 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'room', 'book', 'door', 'living', 'television', 'ground', 'floor', 'table', 'wall', 'process', 'chair', 'coffee', 'bag', 'device', 'bottle', 'speaker', 'couch', 'switch', 'doorway', 'closet', 'cord', 'outlet', 'candle', 'socket'] 2022-03-16 11:35:01,893.893 2829:trainer.py:487 do_train_dict(): eta: 23:44:39 iter: 14800 speed: 304.5 images/sec total_norm: 129.2645 (131.0651) loss: 154.9791 (155.8498) masked_loss: 1.6983 (1.7583) tag_loss: 152.9878 (154.0915) time: 1.4332 (1.6816) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4282 (1.6765) save_time: 8.8805 (41.1344) lr: 0.000078 max mem: 26307 2022-03-16 11:35:02,254.254 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.3888888955116272 2022-03-16 11:35:02,254.254 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 175.73284912109375 2022-03-16 11:35:02,254.254 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.71537181675033 2022-03-16 11:35:10,513.513 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018120817840099335 2022-03-16 11:35:10,513.513 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 11:35:10,514.514 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'person', '[MASK]', 'is', 'skiing', 'in', 'a', 'des', '##olate', '[MASK]', 'snowy', 'area', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 11:35:10,529.529 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'jacket', 'ski', 'snow', 'helmet', 'ground', '[UNK]', 'lift', 'tree', 'cloud', 'person', 'child', 'pole', 'coat', 'wire', 'boy', 'glove', 'chair', 'head', 'building', 'kid', 'girl', 'boot', 'man', 'skier', 'track', 'mountain', 'hand', 'hat', 'light', 'tower', 'sign', 'fence', 'slope', 'hill', 'shadow', 'foot', 'line', 'roof', 'car', 'shoe', 'leg', 'snowy', 'window', 'house', 'cable', 'bench', 'group', 'background', 'arm'] 2022-03-16 11:35:26,477.477 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'area', 'line', 'building', 'road', 'ground', 'person', 'child', 'sun', 'mountain', 'tree', 'sky', 'leg', 'background', 'snow', 'cloud', 'lift', 'pole', 'jacket', 'wire', 'ski', 'helmet', 'glove', 'snowy'] 2022-03-16 11:37:50,121.121 2829:trainer.py:487 do_train_dict(): eta: 23:42:03 iter: 14900 speed: 304.4 images/sec total_norm: 130.1726 (133.2278) loss: 156.5107 (156.8679) masked_loss: 1.6728 (1.7032) tag_loss: 154.7569 (155.1647) time: 1.4331 (1.6823) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4281 (1.6772) save_time: 8.8805 (41.1344) lr: 0.000078 max mem: 26307 2022-03-16 11:37:50,482.482 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5833333134651184 2022-03-16 11:37:50,482.482 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 182.0166015625 2022-03-16 11:37:50,482.482 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.67814407348632 2022-03-16 11:37:58,796.796 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01809064671397209 2022-03-16 11:37:58,797.797 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 11:37:58,797.797 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'sign', 'on', 'a', 'street', 'post', '[MASK]', 'smiling', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 11:37:58,812.812 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sign', 'building', 'window', 'sky', 'letter', 'pole', 'wall', '[UNK]', 'arrow', 'street', 'light', 'city', 'banner', 'writing', 'word', 'store', 'traffic', 'flag', 'air', 'cloud', 'side', 'escape', 'roof', 'tall', 'tree', 'large', 'different', 'logo', 'corner', 'skyscraper', 'door', 'front', 'lamp', 'circle', 'number', 'tower', 'balcony', 'many', 'way', 'line', 'person', 'red', 'car', 'hand', 'blue', 'next', 'bus', 'view', 'brick', 'glass'] 2022-03-16 11:38:14,781.781 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'building', 'street', 'post', 'word', 'wall', 'window', 'letter', 'sign', 'sky', 'arrow', 'banner', 'bolt', 'screw'] 2022-03-16 11:40:38,449.449 2829:trainer.py:487 do_train_dict(): eta: 23:39:27 iter: 15000 speed: 304.2 images/sec total_norm: 130.9289 (133.9651) loss: 153.5126 (156.2343) masked_loss: 1.7264 (1.7388) tag_loss: 151.7011 (154.4955) time: 1.4340 (1.6833) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4288 (1.6782) save_time: 8.8805 (41.1344) lr: 0.000077 max mem: 26307 2022-03-16 11:40:38,451.451 2829:checkpoint.py:222 save(): Saving checkpoint to output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0015000.pt 2022-03-16 11:40:47,847.847 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4000000059604645 2022-03-16 11:40:47,848.848 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 154.01416015625 2022-03-16 11:40:47,848.848 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.68138496449451 2022-03-16 11:40:56,230.230 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0181103702634573 2022-03-16 11:40:56,230.230 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 11:40:56,230.230 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'some', '[MASK]', 'playing', 'with', 'kite', '[MASK]', 'on', 'the', 'beach', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 11:40:56,245.245 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'kite', 'car', 'tree', 'person', 'jean', 'ground', 'house', 'man', 'string', 'tail', 'lot', 'building', 'shadow', 'shirt', 'parking', 'jacket', '[UNK]', 'bag', 'cloud', 'beach', 'pole', 'snow', 'roof', 'flag', 'child', 'woman', 'sand', 'hair', 'coat', 'backpack', 'air', 'truck', 'head', 'chair', 'hat', 'park', 'umbrella', 'van', 'sign', 'background', 'suv', 'ski', 'tent', 'balloon', 'wheel', 'vehicle', 'parachute', 'top', 'pile'] 2022-03-16 11:41:12,097.097 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['man', 'house', 'building', 'car', 'ground', 'person', 'lot', 'tree', 'beach', 'sky', 'jean', 'roof', 'bag', 'snow', 'truck', 'string', 'flag', 'parking', 'tail', 'cloud', 'jacket', 'umbrella', 'backpack', 'kite'] 03-16 11:42:34.197 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 11:42:34.197 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 11:42:35.427 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-16 11:43:35,193.193 2829:trainer.py:487 do_train_dict(): eta: 23:37:19 iter: 15100 speed: 289.7 images/sec total_norm: 131.6978 (133.8174) loss: 155.8946 (155.8844) masked_loss: 1.7415 (1.7408) tag_loss: 154.0142 (154.1436) time: 1.4346 (1.7674) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4294 (1.6720) save_time: 9.0279 (30.4322) lr: 0.000077 max mem: 26307 2022-03-16 11:43:35,555.555 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6363636255264282 2022-03-16 11:43:35,555.555 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 165.35296630859375 2022-03-16 11:43:35,555.555 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
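At iter 15000 checkpoint.py writes snapshot/model_iter_0015000.pt, and the cost shows up in the next record: the iter-15100 line reports speed down to 289.7 images/sec and a fresh save_time sample (9.0279 current). Below is a sketch of periodic checkpointing with that 7-digit zero-padded naming; the 5000-iteration period, function name, and saved fields are assumptions, since checkpoint.py's save() is not shown.

import os
import torch

def maybe_save(model, optimizer, iteration, out_dir, period=5000):
    # Write snapshot/model_iter_0015000.pt-style files every 'period'
    # iterations. checkpoint.py's real save() logic is not shown in
    # this log; this is an assumed minimal equivalent.
    if iteration % period != 0:
        return
    path = os.path.join(out_dir, "snapshot",
                        "model_iter_%07d.pt" % iteration)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    torch.save({"iteration": iteration,
                "model": model.state_dict(),
                "optimizer": optimizer.state_dict()}, path)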
= 69.66675457201507 2022-03-16 11:43:43,960.960 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018129440024495125 2022-03-16 11:43:43,960.960 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 11:43:43,961.961 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'snow', 'board', 'sticking', 'out', 'of', 'the', 'deep', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 11:43:43,976.976 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'sky', 'snow', 'fence', '[UNK]', 'ground', 'mountain', 'cloud', 'person', 'boot', 'board', 'ski', 'shoe', 'helmet', 'background', 'jacket', 'glove', 'foot', 'pole', 'man', 'building', 'leaf', 'head', 'pine', 'blue', 'branch', 'hand', 'arm', 'hill', 'track', 'leg', 'shadow', 'bag', 'face', 'coat', 'hat', 'rock', 'top', 'trunk', 'bush', 'backpack', 'grass', 'snowy', 'logo', 'strap', 'design', 'boy', 'plant', 'stick', 'roof'] 2022-03-16 11:43:59,791.791 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'ground', 'board', 'deep', 'mountain', 'tree', 'sky', 'snow', 'bush', 'cloud', 'trunk', 'fence', 'helmet', 'shoe', 'strap'] 2022-03-16 11:46:23,605.605 2829:trainer.py:487 do_train_dict(): eta: 23:34:42 iter: 15200 speed: 304.0 images/sec total_norm: 129.5658 (132.0758) loss: 155.2314 (155.5071) masked_loss: 1.8529 (1.8322) tag_loss: 153.0607 (153.6749) time: 1.4332 (1.6842) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4280 (1.6791) save_time: 9.0279 (30.4322) lr: 0.000077 max mem: 26307 2022-03-16 11:46:23,966.966 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7272727489471436 2022-03-16 11:46:23,966.966 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 142.22222900390625 2022-03-16 11:46:23,967.967 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.69110715778825 2022-03-16 11:46:32,409.409 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018128776922822 2022-03-16 11:46:32,409.409 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 11:46:32,409.409 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'programmes', 'sits', 'in', 'front', 'of', 'a', '[MASK]', 'and', 'a', 'counter', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 11:46:32,425.425 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', '[UNK]', 'shelf', 'sink', 'towel', 'window', 'bathroom', 'bottle', 'cup', 'container', 'cross', 'soap', 'box', 'lid', 'mirror', 'handle', 'can', 'star', 'ledge', 'candle', 'tile', 'bag', 'pipe', 'door', 'hook', 'floor', 'ceiling', 'paper', 'toilet', 'sponge', 'holder', 'basket', 'light', 'robe', 'cabinet', 'curtain', 'book', 'tank', 'tub', 'reflection', 'tissue', 'bowl', 'trash', 'frame', 'white', 'cord', 'rack', 'flag', 'roll', 'jar'] 2022-03-16 11:46:48,365.365 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'front', 'light', 'cup', 'floor', 'wall', 'cross', 'window', 'sign', 'bag', 'counter', 'mirror', 'bathroom', 'bottle', 'ceiling', 'sink', 'soap', 'keyboard', 'holder', 'towel', 'ribbon', 'shelf', 'container', 'drain', 'tile', 'glove', 'ledge', 'comb'] 2022-03-16 11:49:11,908.908 2829:trainer.py:487 do_train_dict(): eta: 23:32:05 iter: 15300 speed: 304.2 images/sec total_norm: 135.8611 (139.7964) loss: 148.9173 (150.7546) masked_loss: 1.7634 (1.7232) tag_loss: 147.4116 (149.0314) time: 1.4327 (1.6830) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4275 (1.6778) save_time: 9.0279 (30.4322) lr: 0.000077 max mem: 26307 2022-03-16 11:49:12,270.270 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6060606241226196 2022-03-16 11:49:12,271.271 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 178.60105895996094 2022-03-16 11:49:12,271.271 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.68669366217279 2022-03-16 11:49:20,858.858 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01814165525138378 2022-03-16 11:49:20,859.859 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 11:49:20,859.859 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'lady', 'in', 'dark', 'clothes', 'with', '[MASK]', 'dark', 'bag', 'and', 'a', '[MASK]', 'bell', 'umbrella', 'is', 'standing', '[MASK]', 'neatly', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 11:49:20,875.875 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['grass', 'umbrella', 'tree', 'sky', 'fence', 'person', 'woman', 'park', 'jacket', 'hand', '[UNK]', 'field', 'bag', 'coat', 'handle', 'car', 'sidewalk', 'background', 'pole', 'purse', 'hair', 'bench', 'leg', 'ground', 'road', 'building', 'shoe', 'girl', 'lady', 'head', 'shirt', 'trash', 'can', 'strap', 'light', 'jean', 'cloud', 'leaf', 'post', 'boot', 'green', 'rain', 'hood', 'dirt', 'trunk', 'man', 'bush', 'dress', 'path', 'arm'] 2022-03-16 11:49:36,902.902 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'can', 'hand', 'park', 'woman', 'dark', 'blue', 'post', 'person', 'lady', 'tree', 'sky', 'leg', 'bell', 'bag', 'handle', 'coat', 'grass', 'pole', 'fence', 'trash', 'umbrella', 'fencing'] 2022-03-16 11:52:00,579.579 2829:trainer.py:487 do_train_dict(): eta: 23:29:29 iter: 15400 speed: 303.6 images/sec total_norm: 132.1841 (136.1458) loss: 153.2925 (154.6275) masked_loss: 1.6942 (1.7564) tag_loss: 151.1323 (152.8711) time: 1.4336 (1.6867) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4283 (1.6815) save_time: 9.0279 (30.4322) lr: 0.000077 max mem: 26307 2022-03-16 11:52:00,940.940 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5428571701049805 2022-03-16 11:52:00,940.940 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 131.2457275390625 2022-03-16 11:52:00,940.940 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.71586608886719 2022-03-16 11:52:09,561.561 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018207857385277748 2022-03-16 11:52:09,561.561 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 11:52:09,562.562 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'snow', 'board', '##er', 'riding', '[MASK]', 'a', 'snow', '[MASK]', 'summit', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 11:52:09,577.577 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'snow', '[UNK]', 'man', 'jacket', 'ground', 'hat', 'person', 'glove', 'sky', 'coat', 'head', 'arm', 'hand', 'snowy', 'leg', 'pole', 'helmet', 'foot', 'ski', 'track', 'board', 'hill', 'slope', 'boot', 'cap', 'mountain', 'shoe', 'group', 'hood', 'day', 'skier', 'backpack', 'area', 'pine', 'trunk', 'face', 'side', 'cloud', 'sun', 'top', 'winter', 'poles', 'couple', 'woman', 'sign', 'forest', 'building', 'footprint', 'country'] 2022-03-16 11:52:25,528.528 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'ground', 'rock', 'arm', 'mountain', 'covered', 'tree', 'sky', 'snow', 'coat', 'hat', 'summit', 'jacket', 'glove'] 2022-03-16 11:54:49,147.147 2829:trainer.py:487 do_train_dict(): eta: 23:26:52 iter: 15500 speed: 303.7 images/sec total_norm: 131.8490 (134.0243) loss: 155.5661 (156.3271) masked_loss: 1.6858 (1.7185) tag_loss: 153.8783 (154.6086) time: 1.4333 (1.6857) data: 0.0001 (0.0005) to_device: 0.0051 (0.0049) time_gpu: 1.4283 (1.6803) save_time: 9.0279 (30.4322) lr: 0.000077 max mem: 26307 2022-03-16 11:54:49,507.507 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6571428775787354 2022-03-16 11:54:49,507.507 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 162.46498107910156 2022-03-16 11:54:49,507.507 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.74902774126102 2022-03-16 11:54:58,154.154 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018178725615143776 2022-03-16 11:54:58,155.155 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 11:54:58,155.155 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'driver', 'examining', 'a', 'minor', 'traffic', 'accident', 'between', 'a', 'bus', 'and', 'a', 'car', '[MASK]', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 11:54:58,170.170 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['jean', 'road', 'man', 'plate', 'bus', 'license', 'street', '[UNK]', 'jacket', 'windshield', 'sign', 'window', 'shoe', 'sky', 'light', 'tire', 'person', 'shadow', 'shirt', 'building', 'leg', 'car', 'hair', 'mirror', 'tree', 'bumper', 'ladder', 'rack', 'number', 'hat', 'pole', 'line', 'wheel', 'woman', 'truck', 'head', 'stripe', 'sidewalk', 'flag', 'cloud', 'letter', 'bag', 'curb', 'van', 'hand', 'logo', 'city', 'front', 'fence', 'door'] 2022-03-16 11:55:14,067.067 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'number', 'line', 'building', 'road', 'street', 'light', 'woman', 'car', 'hair', 'wall', 'window', 'tree', 'store', 'minor', 'sign', 'sky', 'jean', 'shirt', 'bus', 'traffic', 'driver', 'leg', 'accident', 'plate', 'mirror', 'coat', 'license', 'cloud', 'jacket', 'logo', 'shoe', 'sidewalk', 'tire', 'curb', 'grill', 'windshield', 'bumper'] 2022-03-16 11:57:37,738.738 2829:trainer.py:487 do_train_dict(): eta: 23:24:15 iter: 15600 speed: 303.7 images/sec total_norm: 130.5420 (133.6573) loss: 153.7613 (154.1538) masked_loss: 1.6774 (1.7158) tag_loss: 152.5827 (152.4379) time: 1.4335 (1.6859) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4282 (1.6808) save_time: 9.0279 (30.4322) lr: 0.000077 max mem: 26307 2022-03-16 11:57:38,101.101 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5625 2022-03-16 11:57:38,101.101 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 164.8413543701172 2022-03-16 11:57:38,101.101 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.73642103839073 2022-03-16 11:57:46,803.803 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01817481964826584 2022-03-16 11:57:46,804.804 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 11:57:46,804.804 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'large', 'black', 'bird', 'perched', 'next', 'to', 'an', 'outdoor', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 11:57:46,819.819 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['clock', 'building', 'window', 'hand', 'bird', 'number', 'statue', 'sculpture', 'beak', 'wing', 'sky', 'head', '[UNK]', 'horse', 'large', 'face', 'wall', 'feather', 'tail', 'skyscraper', 'foot', 'neck', 'roof', 'leg', 'pole', 'tall', 'dinosaur', 'tree', 'front', 'light', 'black', 'reflection', 'top', 'hour', 'eagle', 'mouth', 'animal', 'white', 'sign', 'person', 'shadow', 'next', 'big', 'line', 'stair', 'man', 'glass', 'tower', 'nest', 'balcony'] 2022-03-16 11:58:02,705.705 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'hand', 'number', 'next', 'black', 'building', 'large', 'window', 'wing', 'horse', 'sky', 'bird', 'clock', 'tail', 'statue', 'outdoor', 'nest', 'balcony', 'feathers', 'feather', 'beak'] 2022-03-16 12:00:26,506.506 2829:trainer.py:487 do_train_dict(): eta: 23:21:39 iter: 15700 speed: 303.4 images/sec total_norm: 133.9581 (138.1694) loss: 153.1067 (155.1483) masked_loss: 1.6834 (1.7285) tag_loss: 151.2865 (153.4198) time: 1.4345 (1.6877) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4291 (1.6825) save_time: 9.0279 (30.4322) lr: 0.000076 max mem: 26307 2022-03-16 12:00:26,868.868 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5625 2022-03-16 12:00:26,869.869 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 143.30093383789062 2022-03-16 12:00:26,869.869 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.7733531420744 2022-03-16 12:00:35,560.560 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018227895721793175 2022-03-16 12:00:35,560.560 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 12:00:35,561.561 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'this', 'is', 'a', '[MASK]', '##board', '##er', '[MASK]', 'a', 'dangerous', 'trick', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 12:00:35,576.576 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'tree', 'arm', 'short', 'shirt', 'man', '[UNK]', 'step', 'hand', 'stair', 'boy', 'shoe', 'railing', 'head', 'leg', 'rail', 'cloud', 'wall', 'wheel', 'mountain', 'park', 'hair', 'hat', 'person', 'ramp', 'palm', 'building', 'woman', 'top', 'skate', 'ledge', 'board', 'ground', 'air', 'bush', 'watch', 'trick', 'bench', 'grass', 'sunglasses', 'cap', 'young', 'sign', 'foot', 'girl', 'tank', 'concrete', 'face', 'roof', 'sidewalk'] 2022-03-16 12:00:51,484.484 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'top', 'park', 'short', 'hair', 'person', 'wall', 'arm', 'mountain', 'couple', 'foot', 'step', 'tree', 'sky', 'shirt', 'leg', 'dangerous', 'tank', 'wheel', 'cloud', 'pole', 'bench', 'trick', 'fence', 'shoe', 'ramp', 'skate', 'stair'] 2022-03-16 12:03:15,144.144 2829:trainer.py:487 do_train_dict(): eta: 23:19:02 iter: 15800 speed: 303.6 images/sec total_norm: 129.6254 (134.2665) loss: 150.7224 (155.0574) masked_loss: 1.7302 (1.7660) tag_loss: 148.7880 (153.2914) time: 1.4334 (1.6864) data: 0.0002 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4287 (1.6814) save_time: 9.0279 (30.4322) lr: 0.000076 max mem: 26307 2022-03-16 12:03:15,505.505 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5454545617103577 2022-03-16 12:03:15,505.505 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 170.1243896484375 2022-03-16 12:03:15,505.505 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.77492259283486 2022-03-16 12:03:24,245.245 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018213197588920593 2022-03-16 12:03:24,245.245 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 12:03:24,245.245 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'couple', 'of', 'orange', '##s', 'are', 'on', 'a', '[MASK]', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 12:03:24,260.260 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['book', 'table', 'mouse', 'orange', 'apple', 'tomato', 'paper', 'plate', 'fruit', 'background', 'stack', '[UNK]', 'magazine', 'phone', 'stem', 'bowl', 'peach', 'napkin', 'desk', 'button', 'pen', 'pile', 'computer', 'pad', 'top', 'remote', 'wire', 'spoon', 'glass', 'cord', 'food', 'ball', 'wall', 'tray', 'laptop', 'writing', 'banana', 'screen', 'light', 'item', 'handle', 'container', 'pencil', 'reflection', 'keyboard', 'next', 'logo', 'basket', 'picture', 'case'] 2022-03-16 12:03:40,210.210 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'book', 'table', 'phone', 'couple', 'paper', 'computer', 'cell', 'background', 'orange', 'bowl', 'plate', 'fruit', 'apple', 'button', 'remote', 'stem', 'mouse', 'cloth', 'sauce', 'spoon', 'peach', 'tomato'] 2022-03-16 12:06:03,832.832 2829:trainer.py:487 do_train_dict(): eta: 23:16:26 iter: 15900 speed: 303.5 images/sec total_norm: 134.2012 (137.3562) loss: 156.0555 (156.1601) masked_loss: 1.6816 (1.7332) tag_loss: 154.2530 (154.4269) time: 1.4333 (1.6869) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4280 (1.6817) save_time: 9.0279 (30.4322) lr: 0.000076 max mem: 26307 2022-03-16 12:06:04,197.197 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6470588445663452 2022-03-16 12:06:04,197.197 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 147.95608520507812 2022-03-16 12:06:04,197.197 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.7871949672699 2022-03-16 12:06:13,055.055 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018209436908364296 2022-03-16 12:06:13,056.056 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 12:06:13,056.056 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'young', 'boy', 'holding', 'a', 'skate', '##board', 'on', '[MASK]', 'side', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 12:06:13,071.071 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'shirt', 'grass', 'wheel', 'sidewalk', 'car', 'boy', 'arm', 'shoe', 'shadow', 'hand', 'ground', 'design', 'pad', 'head', 'street', 'window', 'road', 'face', 'tree', 'hair', 'helmet', 'curb', 'band', 'man', 'building', 'truck', 'person', 'glove', 'light', 'tire', 'plate', 'door', 'nose', 'park', 'eye', 'strap', 'leaf', 'license', 'jean', 'mouth', 'house', 'pole', 'bush', 'wall', 'fence', 'logo', 'line', 'elbow', 'board'] 2022-03-16 12:06:29,001.001 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'house', 'hand', 'side', 'face', 'building', 'door', 'road', 'street', 'young', 'light', 'car', 'ground', 'hair', 'arm', 'boy', 'eye', 'walk', 'window', 'tree', 'sky', 'shirt', 'shadow', 'wheel', 'grass', 'tail', 'shoe', 'sidewalk', 'glove'] 2022-03-16 12:08:52,669.669 2829:trainer.py:487 do_train_dict(): eta: 23:13:49 iter: 16000 speed: 303.3 images/sec total_norm: 130.1435 (133.7759) loss: 152.1563 (154.6496) masked_loss: 1.7270 (1.7722) tag_loss: 150.6622 (152.8774) time: 1.4337 (1.6883) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4287 (1.6832) save_time: 9.0279 (30.4322) lr: 0.000076 max mem: 26307 2022-03-16 12:08:53,029.029 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5135135054588318 2022-03-16 12:08:53,029.029 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 165.96426391601562 2022-03-16 12:08:53,029.029 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.81289227408652 2022-03-16 12:09:01,889.889 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01821252889931202 2022-03-16 12:09:01,889.889 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 12:09:01,889.889 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'is', 'being', '[MASK]', 'slicing', 'a', 'birthday', '[MASK]', 'while', 'another', 'man', 'is', 'staring', '[MASK]', 'him', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 12:09:01,905.905 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['man', 'wall', 'shirt', 'head', 'hand', 'hair', 'ear', 'flower', 'jacket', 'glasses', 'woman', 'person', '[UNK]', 'floor', 'carpet', 'jean', 'room', 'box', 'coat', 'shoe', 'table', 'cake', 'face', 'dress', 'switch', 'chair', 'plate', 'light', 'outlet', 'heart', 'arm', 'sign', 'tag', 'fork', 'knife', 'belt', 'leg', 'paper', 'can', 'bottle', 'watch', 'suit', 'tie', 'bag', 'group', 'stick', 'purse', 'rug', 'camera', 'next'] 2022-03-16 12:09:17,812.812 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'hand', 'room', 'woman', 'hair', 'person', 'table', 'wall', 'chair', 'jean', 'shirt', 'ear', 'camera', 'plate', 'knife', 'birthday', 'flower', 'jacket', 'fork', 'cake', 'carpet', 'rug'] 2022-03-16 12:11:41,525.525 2829:trainer.py:487 do_train_dict(): eta: 23:11:13 iter: 16100 speed: 303.2 images/sec total_norm: 131.1263 (132.7787) loss: 155.3250 (156.2789) masked_loss: 1.7928 (1.7767) tag_loss: 153.0792 (154.5023) time: 1.4338 (1.6886) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4289 (1.6834) save_time: 9.0279 (30.4322) lr: 0.000076 max mem: 26307 2022-03-16 12:11:41,886.886 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091 2022-03-16 12:11:41,886.886 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 147.29998779296875 2022-03-16 12:11:41,886.886 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.80913713243272 2022-03-16 12:11:50,847.847 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01819959282875061 2022-03-16 12:11:50,847.847 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 12:11:50,847.847 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'male', 'tennis', 'player', '[MASK]', 'leaned', '[MASK]', 'in', 'a', 'position', 'preparing', 'for', 'the', '[MASK]', 'players', 'serve', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 12:11:50,863.863 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'tennis', 'shirt', 'man', 'shoe', 'court', 'short', 'head', 'ground', 'leg', 'hand', 'band', 'sock', 'hair', 'arm', 'player', 'handle', 'line', 'footprint', 'string', 'ball', 'ear', 'stripe', 'shadow', 'blue', 'dirt', 'face', 'wrist', 'sleeve', 'person', 'male', 'foot', 'outfit', 'hat', 'top', 'glove', 'track', 'net', 'tape', 'cap', 'knee', 'clay', 'ready', 'logo', 'letter', 'air', 'game', 'surface', 'action', 'shot'] 2022-03-16 12:12:06,758.758 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'other', 'head', 'man', 'hand', 'band', 'player', 'court', 'short', 'position', 'ground', 'hair', 'track', 'arm', 'male', 'shirt', 'leg', 'handle', 'tennis', 'shoe', 'sock'] 03-16 12:12:35.528 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 12:12:35.528 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 12:12:36.599 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 94}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 94}] 2022-03-16 12:14:30,343.343 2829:trainer.py:487 do_train_dict(): eta: 23:08:36 iter: 16200 speed: 303.3 images/sec total_norm: 130.7571 (133.1061) loss: 152.3263 (152.8713) masked_loss: 1.6009 (1.6793) tag_loss: 150.9995 (151.1920) time: 1.4324 (1.6882) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4272 (1.6829) save_time: 9.0279 (30.4322) lr: 0.000076 max mem: 26307 2022-03-16 12:14:30,704.704 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6666666865348816 2022-03-16 12:14:30,705.705 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 161.28192138671875 2022-03-16 12:14:30,705.705 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.80887898638204 2022-03-16 12:14:39,654.654 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018146023154258728 2022-03-16 12:14:39,654.654 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 12:14:39,654.654 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'and', '[MASK]', 'sitting', 'next', 'to', 'each', 'other', 'on', '[MASK]', 'park', 'bench', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 12:14:39,670.670 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['building', 'step', 'umbrella', 'shirt', 'ground', 'fence', 'person', 'window', 'bottle', 'jean', 'stair', 'man', 'grass', 'tree', 'woman', 'shoe', '[UNK]', 'bag', 'dog', 'bench', 'head', 'roof', 'hat', 'sidewalk', 'pole', 'hair', 'foot', 'book', 'couple', 'leg', 'hand', 'park', 'wall', 'cap', 'sky', 'shadow', 'curb', 'railing', 'dirt', 'watch', 'short', 'cup', 'paper', 'bird', 'backpack', 'rock', 'water', 'purse', 'lady', 'newspaper'] 2022-03-16 12:14:55,621.621 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['other', 'city', 'man', 'hand', 'next', 'building', 'book', 'road', 'park', 'woman', 'ground', 'hair', 'person', 'couple', 'window', 'step', 'tree', 'watch', 'jean', 'shirt', 'leg', 'bag', 'shadow', 'palm', 'grass', 'bottle', 'pole', 'bench', 'fence', 'shoe', 'trash', 'sidewalk', 'umbrella', 'curb', 'stair'] 2022-03-16 12:17:19,292.292 2829:trainer.py:487 do_train_dict(): eta: 23:05:59 iter: 16300 speed: 303.1 images/sec total_norm: 133.7826 (137.7139) loss: 152.2081 (153.8290) masked_loss: 1.7153 (1.6968) tag_loss: 150.0632 (152.1322) time: 1.4327 (1.6895) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4273 (1.6842) save_time: 9.0279 (30.4322) lr: 0.000075 max mem: 26307 2022-03-16 12:17:19,653.653 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5 2022-03-16 12:17:19,653.653 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 145.068115234375 2022-03-16 12:17:19,653.653 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.83430718212593 2022-03-16 12:17:28,731.731 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018120521679520607 2022-03-16 12:17:28,732.732 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 12:17:28,732.732 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'blue', 'train', '[MASK]', 'outside', 'of', '[MASK]', 'train', 'station', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 12:17:28,748.748 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['train', 'sky', 'window', 'track', 'light', '[UNK]', 'platform', 'sidewalk', 'windshield', 'front', 'tree', 'gravel', 'logo', 'bumper', 'vent', 'pole', 'door', 'line', 'building', 'station', 'ground', 'roof', 'number', 'sign', 'car', 'person', 'fence', 'shirt', 'wire', 'man', 'cloud', 'blue', 'ladder', 'grass', 'engine', 'white', 'post', 'horn', 'stripe', 'bush', 'woman', 'writing', 'stop', 'railroad', 'bus', 'flag', 'wheel', 'passenger', 'wall', 'lamp'] 2022-03-16 12:17:44,731.731 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'station', 'top', 'door', 'road', 'front', 'light', 'car', 'ground', 'blue', 'track', 'person', 'wall', 'window', 'train', 'sky', 'shirt', 'platform', 'pole', 'logo', 'fence', 'gravel', 'sidewalk', 'vent', 'bumper'] 2022-03-16 12:20:08,210.210 2829:trainer.py:487 do_train_dict(): eta: 23:03:22 iter: 16400 speed: 303.1 images/sec total_norm: 130.1776 (134.3516) loss: 152.4139 (153.4830) masked_loss: 1.7335 (1.7460) tag_loss: 150.4457 (151.7370) time: 1.4329 (1.6892) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4277 (1.6840) save_time: 9.0279 (30.4322) lr: 0.000075 max mem: 26307 2022-03-16 12:20:08,572.572 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4000000059604645 2022-03-16 12:20:08,572.572 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 144.09503173828125 2022-03-16 12:20:08,572.572 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.85548077207623 2022-03-16 12:20:17,604.604 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018132081255316734 2022-03-16 12:20:17,604.604 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 12:20:17,605.605 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'kite', 'flies', 'over', 'a', 'small', 'group', '[MASK]', '[MASK]', 'by', 'the', 'water', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 12:20:17,620.620 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'kite', 'tree', 'water', 'cloud', 'boat', 'person', 'tail', 'man', '[UNK]', 'beach', 'shirt', 'lake', 'shadow', 'string', 'sand', 'child', 'hat', 'grass', 'balloon', 'short', 'head', 'woman', 'building', 'house', 'shore', 'chair', 'group', 'jacket', 'umbrella', 'hair', 'ribbon', 'car', 'large', 'leg', 'boy', 'ground', 'distance', 'body', 'sail', 'air', 'colorful', 'ocean', 'forest', 'rope', 'day', 'rock', 'park', 'jean', 'bag'] 2022-03-16 12:20:33,525.525 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['man', 'group', 'small', 'water', 'person', 'tree', 'beach', 'sky', 'boat', 'sand', 'tail', 'cloud', 'fence', 'log', 'ribbon', 'kite'] 2022-03-16 12:22:57,176.176 2829:trainer.py:487 do_train_dict(): eta: 23:00:45 iter: 16500 speed: 303.0 images/sec total_norm: 130.1838 (133.3996) loss: 153.4620 (154.6140) masked_loss: 1.7177 (1.7425) tag_loss: 151.7218 (152.8715) time: 1.4323 (1.6897) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4273 (1.6847) save_time: 9.0279 (30.4322) lr: 0.000075 max mem: 26307 2022-03-16 12:22:57,538.538 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5128205418586731 2022-03-16 12:22:57,538.538 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 168.87037658691406 2022-03-16 12:22:57,538.538 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.82732795807253 2022-03-16 12:23:06,664.664 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01812019571661949 2022-03-16 12:23:06,664.664 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 12:23:06,665.665 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', 'people', 'are', 'waiting', 'outside', '[MASK]', 'their', 'bicycles', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 12:23:06,680.680 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['person', 'shirt', 'hair', 'building', 'man', 'woman', 'hat', '[UNK]', 'backpack', 'jean', 'sign', 'window', 'bicycle', 'sky', 'sidewalk', 'cap', 'jacket', 'bike', 'head', 'street', 'bag', 'wall', 'short', 'shoe', 'tree', 'pole', 'glasses', 'hand', 'sweater', 'sunglasses', 'door', 'store', 'letter', 'group', 'shadow', 'skirt', 'road', 'tire', 'ground', 'car', 'girl', 'city', 'bottle', 'roof', 'line', 'purse', 'light', 'boy', 'helmet', 'boot'] 2022-03-16 12:23:22,623.623 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'building', 'street', 'woman', 'hair', 'person', 'seat', 'arm', 'boy', 'phone', 'tree', 'letter', 'sign', 'sky', 'jean', 'shirt', 'palm', 'bottle', 'hat', 'cap', 'jacket', 'glasses', 'bike', 'boot', 'bicycle', 'basket', 'tire', 'backpack', 'sunglasses'] 2022-03-16 12:25:46,282.282 2829:trainer.py:487 do_train_dict(): eta: 22:58:09 iter: 16600 speed: 302.8 images/sec total_norm: 130.7999 (132.8385) loss: 155.6601 (155.2399) masked_loss: 1.8208 (1.8244) tag_loss: 153.3624 (153.4155) time: 1.4332 (1.6910) data: 0.0001 (0.0005) to_device: 0.0051 (0.0049) time_gpu: 1.4281 (1.6856) save_time: 9.0279 (30.4322) lr: 0.000075 max mem: 26307 2022-03-16 12:25:46,643.643 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6285714507102966 2022-03-16 12:25:46,643.643 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 147.41220092773438 2022-03-16 12:25:46,643.643 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.84228118165524 2022-03-16 12:25:55,816.816 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0181451216340065 2022-03-16 12:25:55,816.816 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 12:25:55,816.816 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'police', '[MASK]', 'walking', 'two', 'bikes', 'down', 'the', 'street', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 12:25:55,831.831 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['building', '[UNK]', 'shirt', 'man', 'bike', 'bicycle', 'pot', 'shoe', 'umbrella', 'sidewalk', 'head', 'hand', 'flower', 'wall', 'plant', 'bag', 'woman', 'sign', 'basket', 'window', 'person', 'street', 'hat', 'door', 'wheel', 'hair', 'pipe', 'tire', 'pole', 'uniform', 'curb', 'strap', 'purse', 'road', 'apron', 'helmet', 'backpack', 'arm', 'belt', 'picture', 'jacket', 'leg', 'cap', 'poster', 'plate', 'brick', 'chair', 'city', 'light', 'flag'] 2022-03-16 12:26:11,745.745 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'building', 'door', 'street', 'woman', 'hair', 'police', 'person', 'wall', 'arm', 'officer', 'plant', 'window', 'shirt', 'bag', 'wheel', 'hat', 'uniform', 'pole', 'flower', 'bike', 'pipe', 'pot', 'bicycle', 'basket', 'shoe', 'curtain', 'sidewalk', 'tire', 'umbrella', 'poster', 'curb', 'strap'] 2022-03-16 12:28:35,604.604 2829:trainer.py:487 do_train_dict(): eta: 22:55:33 iter: 16700 speed: 302.4 images/sec total_norm: 130.0176 (132.9274) loss: 151.7409 (152.8994) masked_loss: 1.6811 (1.7035) tag_loss: 149.9599 (151.1959) time: 1.4324 (1.6932) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4275 (1.6881) save_time: 9.0279 (30.4322) lr: 0.000075 max mem: 26307 2022-03-16 12:28:35,964.964 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6000000238418579 2022-03-16 12:28:35,965.965 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 137.01580810546875 2022-03-16 12:28:35,965.965 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.85503101348877 2022-03-16 12:28:45,194.194 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018143733963370323 2022-03-16 12:28:45,195.195 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 12:28:45,195.195 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'close', 'up', 'of', '[MASK]', 'elephant', 'with', 'one', '[MASK]', 'open', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 12:28:45,210.210 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['elephant', 'ear', 'eye', 'head', 'trunk', 'face', 'skin', 'leg', 'body', '[UNK]', 'mouth', 'back', 'close', 'forehead', 'other', 'hair', 'line', 'wall', 'large', 'next', 'herd', 'adult', 'side', 'gray', 'rock', 'grass', 'tree', 'couple', 'camera', 'name', 'area', 'big', 'standing', 'picture', 'field', 'brown', 'water', 'small', 'green', 'open', 'ground', 'tongue', 'view', 'neck', 'grey', 'arm', 'tail', 'baby', 'pair', 'wild'] 2022-03-16 12:29:01,092.092 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'face', 'open', 'skin', 'eye', 'ear', 'tail', 'trunk', 'elephant'] 2022-03-16 12:31:24,721.721 2829:trainer.py:487 do_train_dict(): eta: 22:52:56 iter: 16800 speed: 302.8 images/sec total_norm: 130.6526 (133.5937) loss: 156.5758 (155.5780) masked_loss: 1.7173 (1.7069) tag_loss: 154.6486 (153.8711) time: 1.4330 (1.6912) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4277 (1.6860) save_time: 9.0279 (30.4322) lr: 0.000075 max mem: 26307 2022-03-16 12:31:25,082.082 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.59375 2022-03-16 12:31:25,082.082 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 138.40505981445312 2022-03-16 12:31:25,083.083 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
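Each evaluation block pairs a Sample Generation list (the model's top predicted tags) with a GT Tags list, and a running "Tag Precision." percentage is logged after it. The exact definition behind tagger_caption_uni_pipeline_expanding.py:409 is not visible in the log; as an illustrative guess only, a set-overlap precision over the predicted tags would look like this (corpus-level accumulation omitted).

def tag_precision(predicted_tags, gt_tags):
    # Percentage of predicted tags that also appear in the ground truth.
    # Illustrative only: the pipeline's own metric may threshold scores
    # or average per image rather than over a flat prediction list.
    if not predicted_tags:
        return 0.0
    gt = set(gt_tags)
    hits = sum(1 for tag in predicted_tags if tag in gt)
    return 100.0 * hits / len(predicted_tags)

# The elephant example logged just above:
pred = ['elephant', 'ear', 'eye', 'head', 'trunk', 'face', 'skin', 'leg']
gt = ['[UNK]', 'head', 'face', 'open', 'skin', 'eye', 'ear',
      'tail', 'trunk', 'elephant']
print(tag_precision(pred, gt))  # 87.5 ('leg' is the one miss)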
= 69.85405857605342 2022-03-16 12:31:34,339.339 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01817173697054386 2022-03-16 12:31:34,339.339 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 12:31:34,340.340 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'man', 'standing', 'by', 'shore', 'in', 'ocean', '[MASK]', 'a', 'surf', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 12:31:34,355.355 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'water', 'hair', 'hand', 'head', 'man', 'sky', 'wave', 'arm', 'logo', 'beach', 'building', 'leg', 'board', 'suit', 'hill', 'house', 'wet', 'sand', 'foot', 'shore', 'face', 'person', 'rock', 'ocean', 'surfer', 'mountain', 'cliff', 'mouth', 'reflection', 'surf', 'strap', 'cord', 'cloud', 'tree', 'ear', 'background', 'foam', 'watch', 'design', 'grass', 'line', 'tower', 'rope', 'fin', 'nose', 'roof', 'stripe', 'shirt', 'shoe'] 2022-03-16 12:31:50,340.340 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'line', 'water', 'building', 'board', 'hair', 'mouth', 'arm', 'mountain', 'tree', 'beach', 'sky', 'ocean', 'leg', 'wave', 'ear', 'suit', 'wet', 'shore', 'sand', 'logo', 'reflection', 'strap', 'foam'] 2022-03-16 12:34:14,266.266 2829:trainer.py:487 do_train_dict(): eta: 22:50:20 iter: 16900 speed: 302.0 images/sec total_norm: 131.1679 (133.4499) loss: 151.9933 (152.8905) masked_loss: 1.6240 (1.6615) tag_loss: 150.5166 (151.2290) time: 1.4344 (1.6955) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4290 (1.6902) save_time: 9.0279 (30.4322) lr: 0.000075 max mem: 26307 2022-03-16 12:34:14,625.625 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4571428596973419 2022-03-16 12:34:14,626.626 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 167.4562225341797 2022-03-16 12:34:14,626.626 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.8652432161219 2022-03-16 12:34:23,882.882 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018175462260842323 2022-03-16 12:34:23,882.882 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 12:34:23,882.882 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', 'sail', '##boat', 'ic', '##ome', '##s', 'close', 'to', '[MASK]', '[MASK]', 'people', 'can', 'see', 'each', 'other', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 12:34:23,897.897 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['water', 'person', 'sky', 'boat', 'rock', 'sail', 'bush', 'shirt', '[UNK]', 'reflection', 'beach', 'man', 'bottom', 'shore', 'number', 'tree', 'ocean', 'wave', 'mast', 'woman', 'top', 'head', 'shoreline', 'small', 'deck', 'motor', 'ground', 'wake', 'lake', 'grass', 'horizon', 'couple', 'land', 'bird', 'front', 'ripple', 'cross', 'hat', 'pole', 'group', 'body', 'mountain', 'white', 'sand', 'flag', 'hair', 'base', 'dirt', 'jacket', 'window'] 2022-03-16 12:34:39,869.869 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'other', 'man', 'number', 'water', 'ground', 'rock', 'person', 'tree', 'beach', 'sky', 'shirt', 'bottom', 'boat', 'ocean', 'wave', 'bush', 'reflection', 'sail', 'mast', 'kite'] 2022-03-16 12:37:03,559.559 2829:trainer.py:487 do_train_dict(): eta: 22:47:44 iter: 17000 speed: 302.4 images/sec total_norm: 129.8390 (134.4609) loss: 153.5485 (154.0217) masked_loss: 1.7482 (1.7960) tag_loss: 152.1056 (152.2257) time: 1.4337 (1.6929) data: 0.0001 (0.0002) to_device: 0.0050 (0.0050) time_gpu: 1.4287 (1.6877) save_time: 9.0279 (30.4322) lr: 0.000074 max mem: 26307 2022-03-16 12:37:03,919.919 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5789473652839661 2022-03-16 12:37:03,919.919 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 161.56826782226562 2022-03-16 12:37:03,919.919 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.8867988363344 2022-03-16 12:37:13,569.569 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018235348165035248 2022-03-16 12:37:13,570.570 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 12:37:13,570.570 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'three', 'men', '[MASK]', 'the', '[MASK]', 'with', '[MASK]', 'object', 'sailing', 'through', 'the', 'air', 'between', 'two', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 12:37:13,586.586 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'hand', '[UNK]', 'grass', 'shirt', 'arm', 'hat', 'man', 'leg', 'ground', 'cap', 'head', 'sky', 'shoe', 'jacket', 'person', 'bush', 'rock', 'forest', 'wood', 'path', 'woman', 'hill', 'face', 'hair', 'sunglasses', 'short', 'trail', 'dirt', 'sweatshirt', 'glasses', 'backpack', 'trunk', 'area', 'sleeve', 'sweater', 'boot', 'plant', 'bag', 'foot', 'jean', 'stick', 'pole', 'young', 'water', 'field', 'watch', 'air', 'branch', 'side'] 2022-03-16 12:37:29,527.527 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'air', 'woman', 'ground', 'rock', 'hair', 'person', 'arm', 'tree', 'watch', 'sky', 'shirt', 'path', 'leg', 'ear', 'object', 'grass', 'hat', 'cap', 'jacket', 'wrist', 'glasses', 'shoe'] 2022-03-16 12:39:52,995.995 2829:trainer.py:487 do_train_dict(): eta: 22:45:07 iter: 17100 speed: 302.2 images/sec total_norm: 132.8160 (137.0210) loss: 155.5006 (154.4930) masked_loss: 1.7396 (1.7455) tag_loss: 153.5885 (152.7476) time: 1.4322 (1.6944) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4271 (1.6893) save_time: 9.0279 (30.4322) lr: 0.000074 max mem: 26307 2022-03-16 12:39:53,356.356 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4848484992980957 2022-03-16 12:39:53,357.357 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 149.45701599121094 2022-03-16 12:39:53,357.357 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.9060405908629 2022-03-16 12:40:02,738.738 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018234478309750557 2022-03-16 12:40:02,738.738 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 12:40:02,738.738 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'man', 'sitting', 'on', 'top', 'of', 'a', '[MASK]', 'ledge', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 12:40:02,753.753 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'wheel', 'hand', 'man', 'rock', 'arm', 'ground', 'ear', 'hair', 'head', 'leg', 'person', 'board', 'wall', 'shoe', 'poster', 'face', 'shadow', 'shirt', 'bracelet', 'sheep', 'tree', 'nose', 'short', 'back', 'tattoo', 'road', 'water', 'background', 'wrist', 'foot', 'mouth', 'beach', 'skate', 'picture', 'leaf', 'glasses', 'young', 'strap', 'top', 'logo', 'hat', 'sidewalk', 'eye', 'sunglasses', 'car', 'band', 'grass', 'jean', 'design'] 2022-03-16 12:40:18,679.679 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'building', 'top', 'ground', 'rock', 'wall', 'arm', 'paper', 'shirt', 'animal', 'leg', 'nose', 'ear', 'camera', 'wheel', 'hat', 'cap', 'glasses', 'sheep', 'cement', 'strap', 'ledge', 'bracelet'] 03-16 12:42:36.697 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 12:42:36.697 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 12:42:38.017 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 97}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-16 12:42:42,089.089 2829:trainer.py:487 do_train_dict(): eta: 22:42:30 iter: 17200 speed: 302.8 images/sec total_norm: 134.4014 (136.8270) loss: 152.6346 (152.4155) masked_loss: 1.7752 (1.7952) tag_loss: 151.0208 (150.6203) time: 1.4315 (1.6909) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4262 (1.6858) save_time: 9.0279 (30.4322) lr: 0.000074 max mem: 26307 2022-03-16 12:42:42,453.453 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361 2022-03-16 12:42:42,453.453 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 175.4453125 2022-03-16 12:42:42,453.453 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.89723540730559 2022-03-16 12:42:51,899.899 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018246429041028023 2022-03-16 12:42:51,899.899 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 12:42:51,899.899 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'cat', 'looking', 'angry', 'as', 'it', 'is', '[MASK]', 'on', 'top', 'of', 'a', 'laptop', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 12:42:51,914.914 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['cat', 'keyboard', 'ear', 'head', 'eye', 'laptop', '[UNK]', 'paw', 'key', 'nose', 'face', 'screen', 'computer', 'desk', 'black', 'button', 'table', 'wall', 'leg', 'paper', 'cord', 'tail', 'logo', 'mouse', 'person', 'top', 'pad', 'floor', 'book', 'next', 'light', 'white', 'cloth', 'carpet', 'bed', 'monitor', 'foot', 'chair', 'front', 'writing', 'kitten', 'speaker', 'pen', 'bottle', 'fur', 'window', 'shelf', 'bag', 'animal', 'lap'] 2022-03-16 12:43:07,852.852 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'face', 'top', 'cup', 'person', 'table', 'phone', 'key', 'eye', 'chair', 'paper', 'cell', 'shirt', 'screen', 'nose', 'bag', 'ear', 'desk', 'angry', 'cat', 'tail', 'bottle', 'keyboard', 'cord', 'laptop'] 2022-03-16 12:45:31,236.236 2829:trainer.py:487 do_train_dict(): eta: 22:39:52 iter: 17300 speed: 302.7 images/sec total_norm: 132.8506 (135.2566) loss: 155.6959 (154.1799) masked_loss: 1.6586 (1.7094) tag_loss: 154.3548 (152.4705) time: 1.4321 (1.6916) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4270 (1.6864) save_time: 9.0279 (30.4322) lr: 0.000074 max mem: 26307 2022-03-16 12:45:31,596.596 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5714285969734192 2022-03-16 12:45:31,597.597 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 146.93707275390625 2022-03-16 12:45:31,597.597 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.91809915126055 2022-03-16 12:45:41,103.103 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01827346906065941 2022-03-16 12:45:41,103.103 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 12:45:41,104.104 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'young', 'man', 'holding', 'a', '[MASK]', 'knife', '[MASK]', '[MASK]', 'right', 'hand', 'is', 'posing', 'for', 'a', 'snap', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 12:45:41,119.119 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['eye', 'ear', 'tie', 'man', 'face', 'nose', 'hair', 'shirt', 'hand', 'head', 'wall', 'collar', 'teeth', 'suit', 'mouth', 'blade', 'knife', 'chin', 'eyebrow', 'smile', 'finger', 'jacket', 'neck', 'handle', 'scissors', '[UNK]', 'arm', 'ring', 'forehead', 'sleeve', 'knot', 'stripe', 'wrist', 'watch', 'thumb', 'coat', 'cuff', 'paper', 'white', 'button', 'blue', 'black', 'smiling', 'guy', 'picture', 'sword', 'young', 'dot', 'person', 'woman'] 2022-03-16 12:45:57,102.102 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'right', 'man', 'hand', 'face', 'young', 'hair', 'mouth', 'wall', 'arm', 'smile', 'eye', 'neck', 'shirt', 'teeth', 'finger', 'nose', 'ear', 'sharp', 'suit', 'chin', 'knife', 'tie', 'collar', 'snap', 'knot'] 2022-03-16 12:48:20,618.618 2829:trainer.py:487 do_train_dict(): eta: 22:37:15 iter: 17400 speed: 302.3 images/sec total_norm: 133.6412 (135.6275) loss: 153.5819 (154.5017) masked_loss: 1.7322 (1.7640) tag_loss: 152.0323 (152.7377) time: 1.4331 (1.6937) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4278 (1.6886) save_time: 9.0279 (30.4322) lr: 0.000074 max mem: 26307 2022-03-16 12:48:20,979.979 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.59375 2022-03-16 12:48:20,979.979 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 169.4862060546875 2022-03-16 12:48:20,979.979 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.90597978864398 2022-03-16 12:48:30,443.443 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018285930156707764 2022-03-16 12:48:30,443.443 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 12:48:30,443.443 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'benches', 'are', 'facing', '[MASK]', 'water', 'surrounded', 'by', 'leaves', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 12:48:30,458.458 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'leaf', 'ground', 'park', 'bench', 'water', 'reflection', 'trunk', 'grass', 'leg', 'curb', 'pond', '[UNK]', 'puddle', 'lake', 'stream', 'area', 'sky', 'arm', 'forest', 'red', 'branch', 'pole', 'road', 'person', 'pool', 'flower', 'wall', 'seat', 'back', 'rock', 'bank', 'background', 'wooden', 'fence', 'path', 'light', 'foliage', 'fall', 'couple', 'head', 'pavement', 'top', 'trash', 'next', 'middle', 'sign', 'paint', 'bush', 'front'] 2022-03-16 12:48:46,400.400 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'water', 'park', 'ground', 'arm', 'tree', 'leg', 'background', 'grass', 'bench', 'leaf', 'trunk', 'reflection', 'curb'] 2022-03-16 12:51:09,802.802 2829:trainer.py:487 do_train_dict(): eta: 22:34:38 iter: 17500 speed: 302.6 images/sec total_norm: 130.7618 (133.5605) loss: 151.8541 (152.2430) masked_loss: 1.7109 (1.7954) tag_loss: 150.0105 (150.4476) time: 1.4323 (1.6918) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4272 (1.6868) save_time: 9.0279 (30.4322) lr: 0.000074 max mem: 26307 2022-03-16 12:51:10,163.163 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5151515007019043 2022-03-16 12:51:10,163.163 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 161.96124267578125 2022-03-16 12:51:10,164.164 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.90612610903653 2022-03-16 12:51:19,676.676 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018312757834792137 2022-03-16 12:51:19,677.677 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 12:51:19,677.677 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'vehicles', 'and', '[MASK]', '[MASK]', 'busy', 'urban', 'city', 'setting', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 12:51:19,693.693 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['building', 'street', 'man', 'bus', 'person', '[UNK]', 'road', 'sidewalk', 'pole', 'motorcycle', 'sign', 'license', 'windshield', 'plate', 'car', 'window', 'van', 'number', 'jacket', 'helmet', 'city', 'shirt', 'tire', 'light', 'woman', 'bag', 'line', 'traffic', 'mirror', 'busy', 'bike', 'trash', 'backpack', 'can', 'decker', 'jean', 'vest', 'truck', 'shoe', 'coat', 'double', 'driver', 'curb', 'vehicle', 'hair', 'tree', 'sky', 'head', 'stop', 'purse'] 2022-03-16 12:51:35,599.599 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'city', 'man', 'number', 'building', 'road', 'street', 'woman', 'car', 'person', 'van', 'window', 'sign', 'shirt', 'bus', 'urban', 'setting', 'bag', 'plate', 'busy', 'license', 'pole', 'jacket', 'bike', 'motorcycle', 'helmet', 'shoe', 'sidewalk', 'tire', 'windshield'] 2022-03-16 12:53:59,165.165 2829:trainer.py:487 do_train_dict(): eta: 22:32:00 iter: 17600 speed: 302.3 images/sec total_norm: 133.8442 (135.4127) loss: 149.4632 (150.1285) masked_loss: 1.7335 (1.7827) tag_loss: 147.5871 (148.3459) time: 1.4327 (1.6936) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4275 (1.6886) save_time: 9.0279 (30.4322) lr: 0.000073 max mem: 26307 2022-03-16 12:53:59,525.525 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6000000238418579 2022-03-16 12:53:59,526.526 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 149.092041015625 2022-03-16 12:53:59,526.526 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.9159889436711 2022-03-16 12:54:09,065.065 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018323233351111412 2022-03-16 12:54:09,066.066 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 12:54:09,066.066 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'large', '[MASK]', 'of', 'people', 'riding', 'motorized', 'bicycles', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 12:54:09,082.082 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['person', 'motorcycle', 'man', 'tree', 'shirt', 'building', 'bike', 'street', 'car', 'helmet', 'light', 'road', 'tire', '[UNK]', 'woman', 'sign', 'hat', 'group', 'wheel', 'head', 'window', 'city', 'short', 'bicycle', 'pole', 'hair', 'sidewalk', 'bush', 'traffic', 'jacket', 'night', 'dress', 'shoe', 'line', 'suv', 'license', 'bag', 'mirror', 'parade', 'arm', 'sunglasses', 'busy', 'sky', 'balcony', 'bunch', 'hand', 'backpack', 'banner', 'truck', 'background'] 2022-03-16 12:54:24,975.975 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['man', 'group', 'building', 'large', 'top', 'road', 'street', 'light', 'woman', 'cup', 'short', 'car', 'person', 'tree', 'shirt', 'tank', 'wheel', 'hat', 'bike', 'barrel', 'motorcycle', 'helmet', 'shoe', 'sidewalk', 'tire', 'cone', 'backpack', 'curb', 'motorized'] 2022-03-16 12:56:48,669.669 2829:trainer.py:487 do_train_dict(): eta: 22:29:23 iter: 17700 speed: 302.1 images/sec total_norm: 132.1799 (134.3531) loss: 152.4666 (152.4722) masked_loss: 1.6963 (1.7527) tag_loss: 150.6600 (150.7195) time: 1.4339 (1.6951) data: 0.0001 (0.0005) to_device: 0.0050 (0.0048) time_gpu: 1.4288 (1.6897) save_time: 9.0279 (30.4322) lr: 0.000073 max mem: 26307 2022-03-16 12:56:49,029.029 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.65625 2022-03-16 12:56:49,029.029 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 133.92584228515625 2022-03-16 12:56:49,029.029 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.93238937720824 2022-03-16 12:56:58,642.642 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018357407301664352 2022-03-16 12:56:58,642.642 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 12:56:58,643.643 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'parked', '[MASK]', 'sitting', 'next', 'to', 'a', 'white', 'brick', 'wall', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 12:56:58,658.658 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'mirror', 'motorcycle', 'bike', '[UNK]', 'hand', 'building', 'light', 'finger', 'handle', 'person', 'reflection', 'brick', 'door', 'man', 'window', 'seat', 'camera', 'front', 'shirt', 'button', 'helmet', 'nail', 'arm', 'jacket', 'pipe', 'wheel', 'glass', 'shadow', 'thumb', 'head', 'watch', 'sleeve', 'street', 'ceiling', 'side', 'road', 'pole', 'face', 'ring', 'tank', 'red', 'view', 'frame', 'wrist', 'top', 'hair', 'line', 'license', 'gas'] 2022-03-16 12:57:14,670.670 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'side', 'line', 'next', 'building', 'white', 'door', 'light', 'wall', 'glass', 'handle', 'mirror', 'brick', 'pole', 'bike', 'motorcycle', 'parked'] 2022-03-16 12:59:38,123.123 2829:trainer.py:487 do_train_dict(): eta: 22:26:46 iter: 17800 speed: 302.2 images/sec total_norm: 130.6366 (134.1097) loss: 151.3086 (150.3721) masked_loss: 1.6658 (1.6862) tag_loss: 149.1745 (148.6859) time: 1.4326 (1.6945) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4275 (1.6895) save_time: 9.0279 (30.4322) lr: 0.000073 max mem: 26307 2022-03-16 12:59:38,482.482 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6666666865348816 2022-03-16 12:59:38,483.483 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 174.25140380859375 2022-03-16 12:59:38,483.483 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.9307282090853 2022-03-16 12:59:48,142.142 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018369663506746292 2022-03-16 12:59:48,143.143 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 12:59:48,143.143 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'group', 'of', 'women', '[MASK]', 'next', 'to', 'each', 'other', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 12:59:48,158.158 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hair', 'shirt', 'man', 'head', 'person', 'cake', 'hand', 'woman', '[UNK]', 'shelf', 'table', 'cup', 'phone', 'plate', 'ear', 'face', 'napkin', 'girl', 'arm', 'kitchen', 'cell', 'hat', 'glasses', 'sweater', 'picture', 'bowl', 'laptop', 'apron', 'glass', 'light', 'bottle', 'screen', 'restaurant', 'food', 'camera', 'mouth', 'logo', 'design', 'top', 'nose', 'spoon', 'lady', 'beard', 'towel', 'wall', 'ceiling', 'pot', 'chair', 'oven', 'container'] 2022-03-16 13:00:04,096.096 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'other', 'head', 'man', 'group', 'hand', 'next', 'woman', 'cup', 'hair', 'girl', 'person', 'table', 'arm', 'phone', 'shirt', 'drink', 'plate', 'bottle', 'pan', 'cloth', 'cake', 'shelf', 'laptop', 'sweater', 'scarf', 'napkin'] 2022-03-16 13:02:27,583.583 2829:trainer.py:487 do_train_dict(): eta: 22:24:09 iter: 17900 speed: 302.1 images/sec total_norm: 132.8371 (135.1002) loss: 153.2622 (153.0360) masked_loss: 1.7467 (1.7355) tag_loss: 151.5882 (151.3004) time: 1.4324 (1.6946) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4271 (1.6894) save_time: 9.0279 (30.4322) lr: 0.000073 max mem: 26307 2022-03-16 13:02:27,944.944 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6774193644523621 2022-03-16 13:02:27,945.945 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 167.03985595703125 2022-03-16 13:02:27,945.945 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.92331682840982 2022-03-16 13:02:37,697.697 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018341748043894768 2022-03-16 13:02:37,698.698 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 13:02:37,698.698 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'public', 'bus', 'stopping', 'with', 'mermaid', "'", 's', 'doors', 'open', 'at', 'night', 'time', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 13:02:37,713.713 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['light', 'window', 'windshield', '[UNK]', 'sign', 'road', 'ceiling', 'bus', 'front', 'street', 'number', 'door', 'bumper', 'sidewalk', 'wheel', 'building', 'tire', 'line', 'roof', 'stripe', 'sky', 'pole', 'car', 'plate', 'license', 'station', 'curb', 'night', 'rack', 'logo', 'person', 'bike', 'track', 'advertisement', 'mirror', 'wall', 'ground', 'man', 'snow', 'letter', 'fence', 'shadow', 'train', 'driver', 'pillar', 'top', 'city', 'shirt', 'tree', 'bicycle'] 2022-03-16 13:02:53,712.712 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'time', 'number', 'line', 'public', 'night', 'open', 'door', 'road', 'front', 'street', 'light', 'wall', 'window', 'sign', 'bus', 'snow', 'plate', 'wheel', 'ceiling', 'license', 'sidewalk', 'tire', 'rack', 'windshield'] 2022-03-16 13:05:17,174.174 2829:trainer.py:487 do_train_dict(): eta: 22:21:31 iter: 18000 speed: 301.9 images/sec total_norm: 133.8943 (139.1579) loss: 151.1933 (152.4651) masked_loss: 1.6574 (1.6425) tag_loss: 149.9055 (150.8227) time: 1.4328 (1.6959) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4278 (1.6908) save_time: 9.0279 (30.4322) lr: 0.000073 max mem: 26307 2022-03-16 13:05:17,536.536 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663 2022-03-16 13:05:17,537.537 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 141.84056091308594 2022-03-16 13:05:17,537.537 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.93303950188569 2022-03-16 13:05:27,313.313 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01839062198996544 2022-03-16 13:05:27,313.313 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 13:05:27,314.314 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', 'toilet', 'is', 'next', 'to', 'a', 'curtain', '[MASK]', 'window', '.', 'factories', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 13:05:27,329.329 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['curtain', 'wall', 'window', 'towel', 'bathroom', '[UNK]', 'floor', 'toilet', 'door', 'seat', 'rod', 'light', 'sink', 'shower', 'cabinet', 'rack', 'knob', 'lid', 'holder', 'bottle', 'handle', 'white', 'room', 'hook', 'rug', 'tile', 'tub', 'paper', 'drawer', 'mirror', 'ceiling', 'frame', 'can', 'picture', 'small', 'tank', 'mat', 'shelf', 'lamp', 'ring', 'black', 'outlet', 'open', 'basket', 'vent', 'bag', 'fixture', 'trash', 'table', 'box'] 2022-03-16 13:05:43,208.208 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'next', 'building', 'door', 'light', 'floor', 'wall', 'seat', 'paper', 'window', 'handle', 'bathroom', 'sink', 'holder', 'towel', 'curtain', 'toilet', 'sweater', 'knob', 'vent'] 2022-03-16 13:08:06,736.736 2829:trainer.py:487 do_train_dict(): eta: 22:18:54 iter: 18100 speed: 302.0 images/sec total_norm: 129.3557 (132.3418) loss: 153.9916 (155.1580) masked_loss: 1.6476 (1.6790) tag_loss: 152.1224 (153.4790) time: 1.4326 (1.6956) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4273 (1.6903) save_time: 9.0279 (30.4322) lr: 0.000073 max mem: 26307 2022-03-16 13:08:07,097.097 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5 2022-03-16 13:08:07,097.097 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 144.42556762695312 2022-03-16 13:08:07,097.097 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.9396828871507 2022-03-16 13:08:16,978.978 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01840473897755146 2022-03-16 13:08:16,979.979 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 13:08:16,979.979 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', 'clock', 'tower', '[MASK]', 'union', 'station', 'lights', 'up', 'in', 'the', 'evening', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 13:08:16,994.994 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['flag', 'sky', 'light', 'clock', 'tower', 'pole', 'lamp', 'building', 'window', 'sign', 'hand', 'letter', 'balcony', 'american', 'street', '[UNK]', 'night', 'top', 'roof', 'railing', 'wall', 'blue', 'ball', 'word', 'post', 'tree', 'ring', 'traffic', 'lit', 'large', 'city', 'pillar', 'number', 'tall', 'front', 'wire', 'band', 'antenna', 'dome', 'face', 'brick', 'time', 'background', 'bell', 'spire', 'side', 'red', 'green', 'line', 'column'] 2022-03-16 13:08:32,924.924 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'american', 'hand', 'light', 'window', 'tower', 'letter', 'sky', 'evening', 'clock', 'flag', 'pole', 'wire', 'lamp', 'balcony'] 2022-03-16 13:10:56,421.421 2829:trainer.py:487 do_train_dict(): eta: 22:16:17 iter: 18200 speed: 301.7 images/sec total_norm: 137.3312 (138.9078) loss: 151.8200 (152.5950) masked_loss: 1.6154 (1.6629) tag_loss: 150.0909 (150.9321) time: 1.4334 (1.6969) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4282 (1.6918) save_time: 9.0279 (30.4322) lr: 0.000073 max mem: 26307 2022-03-16 13:10:56,784.784 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6571428775787354 2022-03-16 13:10:56,784.784 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 150.60134887695312 2022-03-16 13:10:56,784.784 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.95259657062468 2022-03-16 13:11:06,639.639 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018465938046574593 2022-03-16 13:11:06,640.640 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 13:11:06,640.640 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'is', '[MASK]', 'at', 'a', 'very', 'large', 'carrot', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 13:11:06,656.656 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['eye', 'nose', 'man', 'wall', 'hand', 'ear', 'lamp', 'head', 'carrot', 'finger', 'mouth', 'face', 'shirt', 'eyebrow', '[UNK]', 'lip', 'shade', 'nail', 'window', 'hair', 'picture', 'chin', 'plant', 'forehead', 'door', 'curtain', 'frame', 'sweater', 'beard', 'jacket', 'orange', 'thumb', 'handle', 'couch', 'ceiling', 'table', 'mustache', 'button', 'shelf', 'cup', 'neck', 'cabinet', 'arm', 'vase', 'chair', 'sleeve', 'stem', 'ring', 'television', 'front'] 2022-03-16 13:11:22,707.707 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'large', 'mouth', 'wall', 'eye', 'shirt', 'picture', 'finger', 'nose', 'ear', 'frame', 'mirror', 'lip', 'couch', 'shade', 'eyebrow', 'pillow', 'beard', 'lamp', 'sofa', 'nail', 'vase', 'carrot'] 03-16 13:12:38.029 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 13:12:38.029 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 13:12:39.242 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-16 13:13:46,228.228 2829:trainer.py:487 do_train_dict(): eta: 22:13:40 iter: 18300 speed: 301.5 images/sec total_norm: 134.2271 (139.2529) loss: 153.4698 (156.0047) masked_loss: 1.6882 (1.6909) tag_loss: 151.8878 (154.3138) time: 1.4343 (1.6981) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4293 (1.6929) save_time: 9.0279 (30.4322) lr: 0.000072 max mem: 26307 2022-03-16 13:13:46,589.589 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6969696879386902 2022-03-16 13:13:46,589.589 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 136.46722412109375 2022-03-16 13:13:46,589.589 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.97382213758385 2022-03-16 13:13:56,464.464 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018461301922798157 2022-03-16 13:13:56,464.464 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 13:13:56,465.465 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'driving', 'down', 'a', 'street', 'near', 'a', 'business', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 13:13:56,480.480 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['building', 'sky', 'sign', 'tree', 'street', 'pole', 'light', 'sidewalk', 'window', 'car', '[UNK]', 'person', 'road', 'lamp', 'city', 'man', 'store', 'line', 'shirt', 'woman', 'post', 'wall', 'jacket', 'roof', 'arrow', 'cloud', 'jean', 'clock', 'can', 'curb', 'banner', 'bus', 'fence', 'van', 'hair', 'fire', 'truck', 'bicycle', 'bag', 'door', 'traffic', 'flower', 'trash', 'plate', 'flag', 'circle', 'plant', 'license', 'bike', 'tire'] 2022-03-16 13:14:12,384.384 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['building', 'road', 'street', 'light', 'business', 'car', 'post', 'person', 'wall', 'window', 'tree', 'sign', 'sky', 'wheel', 'bush', 'cloud', 'pole', 'meter', 'lamp', 'sidewalk', 'tire'] 2022-03-16 13:16:36,117.117 2829:trainer.py:487 do_train_dict(): eta: 22:11:03 iter: 18400 speed: 301.4 images/sec total_norm: 132.8894 (136.5378) loss: 152.1600 (152.8613) masked_loss: 1.6626 (1.7019) tag_loss: 150.3112 (151.1594) time: 1.4329 (1.6989) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4279 (1.6939) save_time: 9.0279 (30.4322) lr: 0.000072 max mem: 26307 2022-03-16 13:16:36,480.480 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6000000238418579 2022-03-16 13:16:36,481.481 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 128.72280883789062 2022-03-16 13:16:36,481.481 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.99499738538587 2022-03-16 13:16:46,443.443 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018462149426341057 2022-03-16 13:16:46,443.443 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 13:16:46,443.443 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'tennis', 'player', 'serves', 'a', 'ball', 'during', 'a', 'tennis', 'game', '[MASK]', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 13:16:46,458.458 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shoe', 'court', '[UNK]', 'tennis', 'shadow', 'man', 'sock', 'shirt', 'short', 'hand', 'leg', 'line', 'head', 'wall', 'arm', 'hair', 'ball', 'person', 'ground', 'hat', 'player', 'letter', 'logo', 'cap', 'sign', 'fence', 'camera', 'woman', 'knee', 'chair', 'stand', 'handle', 'pole', 'banner', 'advertisement', 'skirt', 'stripe', 'male', 'outfit', 'band', 'match', 'sunglasses', 'air', 'top', 'flower', 'face', 'light', 'boy', 'clock', 'crowd'] 2022-03-16 13:17:02,432.432 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'game', 'line', 'player', 'court', 'short', 'ground', 'hair', 'person', 'wall', 'arm', 'ball', 'letter', 'shirt', 'leg', 'camera', 'tennis', 'shadow', 'fan', 'hat', 'globe', 'shoe', 'sock'] 2022-03-16 13:19:25,965.965 2829:trainer.py:487 do_train_dict(): eta: 22:08:25 iter: 18500 speed: 301.4 images/sec total_norm: 134.1048 (137.8646) loss: 151.1212 (151.3907) masked_loss: 1.6932 (1.6862) tag_loss: 149.7786 (149.7045) time: 1.4334 (1.6984) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4284 (1.6932) save_time: 9.0279 (30.4322) lr: 0.000072 max mem: 26307 2022-03-16 13:19:26,327.327 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663 2022-03-16 13:19:26,328.328 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 179.7282257080078 2022-03-16 13:19:26,328.328 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.99253123293641 2022-03-16 13:19:36,340.340 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018474362790584564 2022-03-16 13:19:36,340.340 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 13:19:36,340.340 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'pitcher', 'with', 'his', 'arm', 'up', 'ancestors', 'ready', 'to', 'throw', 'a', 'ball', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 13:19:36,356.356 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['man', 'head', 'shirt', 'glove', 'baseball', 'ear', 'hat', 'nose', 'logo', 'face', 'arm', 'cap', 'hand', 'necklace', '[UNK]', 'eye', 'mouth', 'ball', 'belt', 'player', 'grass', 'hair', 'letter', 'number', 'field', 'jersey', 'tree', 'neck', 'stripe', 'fence', 'finger', 'sleeve', 'leg', 'uniform', 'buckle', 'ground', 'pitcher', 'pitch', 'mound', 'pole', 'dirt', 'wall', 'elbow', 'background', 'writing', 'shadow', 'wrist', 'person', 'net', 'top'] 2022-03-16 13:19:52,263.263 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'hand', 'number', 'face', 'field', 'mouth', 'arm', 'ready', 'eye', 'baseball', 'ball', 'letter', 'shirt', 'jersey', 'finger', 'nose', 'ear', 'grass', 'bush', 'hat', 'cap', 'pitcher', 'logo', 'fence', 'sleeve', 'necklace', 'glove', 'stripe'] 2022-03-16 13:22:15,820.820 2829:trainer.py:487 do_train_dict(): eta: 22:05:48 iter: 18600 speed: 301.4 images/sec total_norm: 133.3804 (136.7868) loss: 153.2503 (157.0127) masked_loss: 1.7232 (1.7141) tag_loss: 151.7137 (155.2986) time: 1.4333 (1.6986) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4281 (1.6935) save_time: 9.0279 (30.4322) lr: 0.000072 max mem: 26307 2022-03-16 13:22:16,180.180 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5714285969734192 2022-03-16 13:22:16,180.180 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 157.94696044921875 2022-03-16 13:22:16,180.180 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.98324886872807 2022-03-16 13:22:26,186.186 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018502885475754738 2022-03-16 13:22:26,187.187 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 13:22:26,187.187 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'kite', 'is', 'seen', '[MASK]', 'high', 'on', 'a', '[MASK]', 'day', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 13:22:26,202.202 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'water', 'airplane', 'ocean', 'beach', 'kite', '[UNK]', 'person', 'tail', 'wave', 'wing', 'boat', 'jet', 'plane', 'air', 'man', 'shirt', 'sand', 'horizon', 'cloud', 'arm', 'bird', 'sun', 'large', 'hair', 'couple', 'head', 'chair', 'shore', 'blue', 'object', 'body', 'jacket', 'day', 'high', 'pole', 'cloudy', 'leg', 'small', 'woman', 'next', 'ship', 'clear', 'short', 'light', 'sea', 'hand', 'top', 'tree', 'low'] 2022-03-16 13:22:42,095.095 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['high', 'day', 'water', 'person', 'couple', 'beach', 'sky', 'shirt', 'ocean', 'airplane', 'kite', 'cloudy'] 2022-03-16 13:25:05,687.687 2829:trainer.py:487 do_train_dict(): eta: 22:03:11 iter: 18700 speed: 301.4 images/sec total_norm: 132.8717 (135.4333) loss: 152.6264 (154.7283) masked_loss: 1.7588 (1.7485) tag_loss: 151.3432 (152.9798) time: 1.4325 (1.6987) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4275 (1.6936) save_time: 9.0279 (30.4322) lr: 0.000072 max mem: 26307 2022-03-16 13:25:06,049.049 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361 2022-03-16 13:25:06,050.050 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 148.09976196289062 2022-03-16 13:25:06,050.050 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.97958467361775 2022-03-16 13:25:16,158.158 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018503960222005844 2022-03-16 13:25:16,158.158 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 13:25:16,159.159 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'black', 'and', 'white', 'photograph', 'of', 'a', 'person', '[MASK]', 'bicycle', 'turning', 'onto', 'a', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 13:25:16,174.174 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['road', 'line', 'light', 'street', 'sign', 'night', 'car', 'pole', '[UNK]', 'sky', 'tree', 'photo', 'ground', 'highway', 'picture', 'sidewalk', 'shadow', 'photograph', 'building', 'white', 'track', 'background', 'dark', 'side', 'lane', 'black', 'reflection', 'traffic', 'number', 'rail', 'arrow', 'median', 'wheel', 'person', 'window', 'railroad', 'meter', 'vehicle', 'image', 'wall', 'fence', 'train', 'letter', 'front', 'arm', 'object', 'empty', 'mirror', 'snow', 'top'] 2022-03-16 13:25:32,130.130 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'line', 'white', 'road', 'light', 'car', 'hair', 'person', 'sign', 'shirt', 'wheel', 'pole', 'bike', 'photograph', 'bicycle', 'tire', 'backpack', 'stripe'] 2022-03-16 13:27:55,518.518 2829:trainer.py:487 do_train_dict(): eta: 22:00:33 iter: 18800 speed: 301.5 images/sec total_norm: 134.5318 (136.1979) loss: 149.8090 (152.3738) masked_loss: 1.7205 (1.7096) tag_loss: 148.0998 (150.6642) time: 1.4322 (1.6983) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4272 (1.6931) save_time: 9.0279 (30.4322) lr: 0.000072 max mem: 26307 2022-03-16 13:27:55,878.878 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5428571701049805 2022-03-16 13:27:55,879.879 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 155.90335083007812 2022-03-16 13:27:55,879.879 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.97973515747717 2022-03-16 13:28:06,044.044 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01851644739508629 2022-03-16 13:28:06,044.044 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 13:28:06,045.045 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'flat', 'tv', 'screen', '[MASK]', 'on', 'top', '[MASK]', 'a', 'book', 'shelf', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 13:28:06,060.060 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['television', 'wall', 'shelf', 'book', 'floor', 'room', '[UNK]', 'railing', 'vent', 'stair', 'chair', 'screen', 'staircase', 'picture', 'carpet', 'rug', 'light', 'stand', 'building', 'speaker', 'living', 'center', 'hair', 'rail', 'window', 'entertainment', 'player', 'ceiling', 'game', 'clock', 'man', 'cabinet', 'boy', 'door', 'couch', 'table', 'step', 'dvd', 'cushion', 'tv', 'box', 'fish', 'post', 'ball', 'lamp', 'remote', 'person', 'leg', 'controller', 'vase'] 2022-03-16 13:28:22,014.014 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'top', 'book', 'television', 'post', 'floor', 'wall', 'stand', 'bar', 'step', 'cd', 'screen', 'dog', 'flat', 'dvd', 'clock', 'cabinet', 'carpet', 'shelf', 'drawer', 'outlet', 'stool', 'railing', 'knob', 'vent', 'stair'] 2022-03-16 13:30:45,705.705 2829:trainer.py:487 do_train_dict(): eta: 21:57:56 iter: 18900 speed: 300.8 images/sec total_norm: 131.1916 (134.0222) loss: 151.5475 (150.7730) masked_loss: 1.6999 (1.7189) tag_loss: 149.3567 (149.0541) time: 1.4326 (1.7019) data: 0.0001 (0.0005) to_device: 0.0051 (0.0050) time_gpu: 1.4273 (1.6964) save_time: 9.0279 (30.4322) lr: 0.000072 max mem: 26307 2022-03-16 13:30:46,067.067 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6363636255264282 2022-03-16 13:30:46,067.067 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 153.66168212890625 2022-03-16 13:30:46,067.067 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.98123670879163 2022-03-16 13:30:56,195.195 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01858404651284218 2022-03-16 13:30:56,195.195 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 13:30:56,195.195 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'rural', 'area', 'with', 'several', 'cars', 'parked', 'and', 'a', 'plant', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 13:30:56,210.210 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['line', 'tree', 'road', 'sign', 'street', 'pole', 'building', 'sky', 'car', 'house', 'window', 'fence', 'sidewalk', 'railing', 'wall', 'ground', 'letter', 'curb', 'roof', '[UNK]', 'bush', 'trash', 'chair', 'plant', 'word', 'grass', 'basket', 'rail', 'arrow', 'post', 'light', 'side', 'can', 'parking', 'box', 'empty', 'writing', 'graffiti', 'back', 'background', 'door', 'tire', 'bicycle', 'balcony', 'leaf', 'city', 'stop', 'next', 'corner', 'wire'] 2022-03-16 13:31:12,070.070 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['house', 'area', 'several', 'line', 'building', 'road', 'street', 'car', 'wall', 'chair', 'plant', 'window', 'tree', 'rural', 'growing', 'sign', 'sky', 'background', 'pole', 'arrow', 'fence', 'sidewalk', 'curb'] 2022-03-16 13:33:35,839.839 2829:trainer.py:487 do_train_dict(): eta: 21:55:19 iter: 19000 speed: 300.9 images/sec total_norm: 133.8207 (134.2490) loss: 153.2930 (152.0901) masked_loss: 1.7395 (1.7414) tag_loss: 151.4269 (150.3487) time: 1.4338 (1.7013) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4286 (1.6962) save_time: 9.0279 (30.4322) lr: 0.000071 max mem: 26307 2022-03-16 13:33:36,201.201 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4324324429035187 2022-03-16 13:33:36,201.201 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 165.6912078857422 2022-03-16 13:33:36,201.201 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.9755395619657 2022-03-16 13:33:46,476.476 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018588216975331306 2022-03-16 13:33:46,476.476 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 13:33:46,476.476 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'living', '[MASK]', 'decor', 'with', 'couch', ',', 'love', 'seat', '[MASK]', 'rec', '##liner', ',', 'and', 'tv', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 13:33:46,491.491 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['curtain', 'television', 'wall', 'floor', 'window', 'room', 'chair', 'couch', 'living', 'rod', 'stand', 'remote', 'speaker', 'door', 'light', 'table', 'ceiling', 'bowl', 'sofa', 'lamp', 'basket', 'entertainment', '[UNK]', 'beam', 'mirror', 'center', 'coffee', 'pillow', 'tv', 'rug', 'picture', 'control', 'armchair', 'tile', 'fan', 'arm', 'furniture', 'shelf', 'ottoman', 'vase', 'cushion', 'flower', 'large', 'building', 'candle', 'flat', 'leather', 'plant', 'shade', 'cabinet'] 2022-03-16 13:34:02,436.436 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'room', 'love', 'door', 'light', 'television', 'floor', 'table', 'wall', 'seat', 'stand', 'chair', 'window', 'bowl', 'speaker', 'ceiling', 'couch', 'remote', 'rod', 'pillow', 'curtain', 'decor'] 2022-03-16 13:36:26,087.087 2829:trainer.py:487 do_train_dict(): eta: 21:52:42 iter: 19100 speed: 300.7 images/sec total_norm: 137.3558 (139.3374) loss: 154.1020 (155.2701) masked_loss: 1.6360 (1.6575) tag_loss: 152.4232 (153.6127) time: 1.4354 (1.7025) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4303 (1.6973) save_time: 9.0279 (30.4322) lr: 0.000071 max mem: 26307 2022-03-16 13:36:26,448.448 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091 2022-03-16 13:36:26,448.448 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 145.3153076171875 2022-03-16 13:36:26,448.448 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.9952248732249 2022-03-16 13:36:36,758.758 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01856849528849125 2022-03-16 13:36:36,758.758 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 13:36:36,758.758 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'orange', 'and', 'gold', 'tour', 'amongst', 'parked', 'near', 'a', 'building', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 13:36:36,773.773 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'bus', 'window', 'windshield', 'cloud', 'tree', 'light', 'mirror', 'building', 'road', 'street', 'roof', '[UNK]', 'pole', 'wheel', 'fence', 'tire', 'sign', 'line', 'front', 'license', 'plate', 'person', 'door', 'shadow', 'car', 'stripe', 'grass', 'bumper', 'curb', 'logo', 'sidewalk', 'top', 'house', 'ground', 'lot', 'man', 'truck', 'white', 'hair', 'parking', 'traffic', 'bush', 'side', 'jean', 'shirt', 'letter', 'barrier', 'number', 'red'] 2022-03-16 13:36:52,697.697 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'line', 'building', 'door', 'road', 'front', 'street', 'light', 'car', 'person', 'gold', 'tour', 'wall', 'window', 'sign', 'sky', 'bus', 'roof', 'plate', 'shadow', 'wheel', 'mirror', 'grass', 'license', 'cloud', 'logo', 'sidewalk', 'tire', 'curb', 'windshield'] 2022-03-16 13:39:16,276.276 2829:trainer.py:487 do_train_dict(): eta: 21:50:05 iter: 19200 speed: 300.8 images/sec total_norm: 135.2329 (137.9147) loss: 152.9567 (152.9809) masked_loss: 1.6233 (1.6778) tag_loss: 151.2971 (151.3030) time: 1.4321 (1.7019) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4271 (1.6968) save_time: 9.0279 (30.4322) lr: 0.000071 max mem: 26307 2022-03-16 13:39:16,637.637 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5428571701049805 2022-03-16 13:39:16,637.637 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 187.38980102539062 2022-03-16 13:39:16,637.637 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.95733359934752 2022-03-16 13:39:26,925.925 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018559526652097702 2022-03-16 13:39:26,925.925 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 13:39:26,925.925 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'woman', 'carrying', 'a', 'bright', 'yellow', 'bag', 'and', 'holding', 'a', 'black', '[MASK]', '[MASK]', 'her', 'head', 'is', 'walking', 'down', '##nay', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 13:39:26,941.941 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['fence', 'umbrella', 'ground', 'tree', 'woman', 'person', 'shoe', '[UNK]', 'leaf', 'sidewalk', 'leg', 'park', 'hair', 'bag', 'jean', 'hand', 'man', 'shirt', 'bench', 'jacket', 'ivy', 'bush', 'hat', 'girl', 'trunk', 'head', 'arm', 'coat', 'dress', 'wall', 'dirt', 'vine', 'purse', 'grass', 'skirt', 'couple', 'curb', 'plant', 'short', 'gate', 'weed', 'top', 'pole', 'backpack', 'gravel', 'foot', 'lady', 'can', 'flower', 'street'] 2022-03-16 13:39:42,840.840 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['city', 'head', 'hand', 'black', 'park', 'street', 'woman', 'ground', 'person', 'wall', 'stone', 'plant', 'tree', 'jean', 'yellow', 'shirt', 'bright', 'bag', 'gate', 'leaf', 'ivy', 'fence', 'shoe', 'cane', 'sidewalk', 'umbrella', 'weed'] 2022-03-16 13:42:06,669.669 2829:trainer.py:487 do_train_dict(): eta: 21:47:28 iter: 19300 speed: 300.5 images/sec total_norm: 136.0160 (139.3437) loss: 148.8186 (152.0599) masked_loss: 1.6841 (1.7017) tag_loss: 147.1089 (150.3581) time: 1.4345 (1.7039) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4293 (1.6987) save_time: 9.0279 (30.4322) lr: 0.000071 max mem: 26307 2022-03-16 13:42:07,029.029 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7058823704719543 2022-03-16 13:42:07,029.029 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 130.18019104003906 2022-03-16 13:42:07,029.029 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.98372431882878 2022-03-16 13:42:17,459.459 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01857331395149231 2022-03-16 13:42:17,459.459 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 13:42:17,459.459 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'there', 'are', 'people', 'on', 'ski', '##s', 'in', 'the', 'snow', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 13:42:17,475.475 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'jacket', 'sky', 'snow', 'pole', 'ski', 'tree', 'mountain', 'glove', 'person', 'woman', 'head', 'coat', 'hand', 'boot', 'cloud', 'ground', 'helmet', 'face', 'hat', 'leg', 'man', 'hair', 'girl', 'skier', 'hill', 'top', 'poles', 'hood', 'snowy', 'slope', 'logo', 'rock', 'arm', 'foot', 'shoe', 'gear', 'strap', 'zipper', 'couple', 'scarf', 'lift', 'sign', 'building', 'board', 'backpack', 'picture', 'glasses', 'front', 'sunglasses'] 2022-03-16 13:42:33,432.432 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'face', 'woman', 'person', 'mountain', 'tree', 'sky', 'snow', 'hat', 'pole', 'jacket', 'ski', 'boot', 'helmet', 'glove', 'strap', 'stripe', 'zipper'] 03-16 13:42:39.341 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 13:42:39.341 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 13:42:40.567 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-16 13:44:56,977.977 2829:trainer.py:487 do_train_dict(): eta: 21:44:50 iter: 19400 speed: 300.6 images/sec total_norm: 135.7278 (136.0436) loss: 152.6704 (152.0363) masked_loss: 1.7860 (1.7910) tag_loss: 151.1162 (150.2453) time: 1.4347 (1.7032) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4297 (1.6981) save_time: 9.0279 (30.4322) lr: 0.000071 max mem: 26307 2022-03-16 13:44:57,338.338 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663 2022-03-16 13:44:57,338.338 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 145.83840942382812 2022-03-16 13:44:57,338.338 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.96997005756084 2022-03-16 13:45:07,744.744 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01858498528599739 2022-03-16 13:45:07,744.744 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 13:45:07,745.745 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', 'inn', '##ards', 'of', 'a', 'wall', 'have', 'been', '[MASK]', 'during', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 13:45:07,760.760 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'floor', 'pipe', 'ground', 'shelf', 'bag', 'tool', 'room', 'handle', 'box', '[UNK]', 'paper', 'door', 'cord', 'bucket', 'hose', 'rock', 'board', 'cardboard', 'wire', 'window', 'pole', 'broom', 'drain', 'bathroom', 'hole', 'bottle', 'wheel', 'dirty', 'stain', 'suitcase', 'table', 'container', 'hammer', 'lid', 'cup', 'leg', 'building', 'construction', 'item', 'toilet', 'strap', 'tile', 'knob', 'light', 'towel', 'wood', 'ladder', 'old', 'outlet'] 2022-03-16 13:45:23,712.712 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['room', 'building', 'case', 'ground', 'floor', 'construction', 'wall', 'paper', 'window', 'box', 'card', 'bag', 'handle', 'pole', 'tool', 'pipe', 'item', 'shelf', 'drain', 'tile', 'broom'] 2022-03-16 13:47:47,351.351 2829:trainer.py:487 do_train_dict(): eta: 21:42:13 iter: 19500 speed: 300.5 images/sec total_norm: 132.4792 (135.8304) loss: 149.4608 (148.4182) masked_loss: 1.6395 (1.6832) tag_loss: 147.6708 (146.7350) time: 1.4342 (1.7037) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4292 (1.6986) save_time: 9.0279 (30.4322) lr: 0.000071 max mem: 26307 2022-03-16 13:47:47,712.712 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5526315569877625 2022-03-16 13:47:47,713.713 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 147.90469360351562 2022-03-16 13:47:47,713.713 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.97463286652857 2022-03-16 13:47:58,197.197 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018621616065502167 2022-03-16 13:47:58,197.197 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 13:47:58,198.198 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'haul', 'of', 'bananas', ',', 'bread', ',', 'onions', ',', 'potatoes', ',', 'milk', ',', '[MASK]', 'and', 'more', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 13:47:58,213.213 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['banana', 'bag', '[UNK]', 'plastic', 'table', 'fruit', 'label', 'package', 'apple', 'market', 'sign', 'bunch', 'box', 'garlic', 'food', 'vegetable', 'tag', 'onion', 'orange', 'tomato', 'carrot', 'bread', 'bananas', 'floor', 'writing', 'stem', 'potato', 'cookie', 'logo', 'leaf', 'chip', 'basket', 'store', 'wall', 'sale', 'pile', 'produce', 'hand', 'paper', 'different', 'full', 'letter', 'display', 'grocery', 'coconut', 'container', 'light', 'person', 'top', 'sack'] 2022-03-16 13:48:14,092.092 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'can', 'table', 'food', 'box', 'bag', 'plastic', 'apple', 'tag', 'milk', 'cream', 'bread', 'logo', 'grocery', 'haul', 'banana', 'strap', 'cookie'] 2022-03-16 13:50:37,774.774 2829:trainer.py:487 do_train_dict(): eta: 21:39:36 iter: 19600 speed: 300.4 images/sec total_norm: 135.4929 (139.4704) loss: 152.5865 (153.6256) masked_loss: 1.7373 (1.7342) tag_loss: 151.3774 (151.8914) time: 1.4339 (1.7043) data: 0.0002 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4289 (1.6993) save_time: 9.0279 (30.4322) lr: 0.000070 max mem: 26307 2022-03-16 13:50:38,135.135 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7058823704719543 2022-03-16 13:50:38,135.135 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 112.56956481933594 2022-03-16 13:50:38,135.135 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.01120473406641 2022-03-16 13:50:48,696.696 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01861617900431156 2022-03-16 13:50:48,697.697 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 13:50:48,697.697 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'close', '##up', 'of', 'a', 'pizza', 'with', '[MASK]', 'and', 'sauce', 'on', 'a', 'pan', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 13:50:48,712.712 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['pizza', '[UNK]', 'plate', 'crust', 'cheese', 'handle', 'sink', 'knife', 'pan', 'table', 'stove', 'shadow', 'sauce', 'tray', 'bowl', 'bubble', 'glass', 'top', 'meat', 'cup', 'background', 'bottle', 'oven', 'mushroom', 'spoon', 'light', 'cloth', 'slice', 'white', 'hole', 'metal', 'food', 'dish', 'olive', 'large', 'container', 'napkin', 'base', 'water', 'close', 'ready', 'fork', 'wall', 'stripe', 'surface', 'object', 'small', 'knob', 'cooked', 'screw'] 2022-03-16 13:51:04,702.702 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'light', 'table', 'paper', 'background', 'roll', 'handle', 'plate', 'knife', 'pan', 'sink', 'cheese', 'towel', 'pizza', 'sauce', 'spoon', 'stove', 'oven', 'crust', 'bun'] 2022-03-16 13:53:28,224.224 2829:trainer.py:487 do_train_dict(): eta: 21:36:59 iter: 19700 speed: 300.4 images/sec total_norm: 132.4779 (136.2762) loss: 146.1230 (148.4942) masked_loss: 1.6040 (1.6749) tag_loss: 144.3014 (146.8193) time: 1.4332 (1.7045) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4284 (1.6995) save_time: 9.0279 (30.4322) lr: 0.000070 max mem: 26307 2022-03-16 13:53:28,587.587 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5277777910232544 2022-03-16 13:53:28,587.587 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 194.71005249023438 2022-03-16 13:53:28,587.587 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.98912100358443 2022-03-16 13:53:39,197.197 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018570857122540474 2022-03-16 13:53:39,197.197 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 13:53:39,198.198 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'bath', 'tub', 'sitting', 'next', 'to', 'a', '[MASK]', '[MASK]', 'in', 'a', 'bathroom', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 13:53:39,213.213 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['bathroom', 'wall', '[UNK]', 'mirror', 'toilet', 'sink', 'tub', 'shower', 'floor', 'towel', 'head', 'handle', 'lid', 'rack', 'ceiling', 'tile', 'light', 'paper', 'soap', 'holder', 'outlet', 'door', 'bar', 'dish', 'drain', 'shelf', 'white', 'glass', 'tank', 'reflection', 'rod', 'box', 'vent', 'large', 'bath', 'knob', 'roll', 'bottle', 'window', 'plate', 'clean', 'cabinet', 'switch', 'next', 'tissue', 'sign', 'small', 'hand', 'ledge', 'bowl'] 2022-03-16 13:53:55,156.156 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'next', 'white', 'door', 'floor', 'wall', 'paper', 'window', 'roll', 'handle', 'mirror', 'bathroom', 'ceiling', 'shower', 'bath', 'sink', 'soap', 'pipe', 'holder', 'towel', 'shelf', 'toilet', 'outlet', 'tile', 'tub', 'rack'] 2022-03-16 13:56:18,888.888 2829:trainer.py:487 do_train_dict(): eta: 21:34:22 iter: 19800 speed: 300.0 images/sec total_norm: 133.4636 (136.8923) loss: 155.7876 (153.8654) masked_loss: 1.6727 (1.7413) tag_loss: 153.7996 (152.1241) time: 1.4343 (1.7066) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4294 (1.7014) save_time: 9.0279 (30.4322) lr: 0.000070 max mem: 26307 2022-03-16 13:56:19,249.249 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6176470518112183 2022-03-16 13:56:19,249.249 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 154.24099731445312 2022-03-16 13:56:19,249.249 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.99470502767132 2022-03-16 13:56:29,852.852 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01856466569006443 2022-03-16 13:56:29,852.852 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 13:56:29,854.854 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'group', 'of', 'three', 'people', 'standing', 'in', 'the', 'snow', 'and', 'holding', '[MASK]', '[MASK]', 'in', '[MASK]', 'hands', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 13:56:29,869.869 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['glove', 'jacket', 'building', 'ski', 'snow', 'window', '[UNK]', 'man', 'head', 'fence', 'ground', 'coat', 'pole', 'helmet', 'boot', 'person', 'hand', 'tag', 'door', 'hat', 'face', 'railing', 'wall', 'tree', 'woman', 'shoe', 'leg', 'roof', 'next', 'gear', 'red', 'house', 'balcony', 'sign', 'poles', 'pile', 'camera', 'boy', 'handle', 'skier', 'backpack', 'gate', 'foot', 'snowy', 'light', 'flag', 'arm', 'badge', 'front', 'couple'] 2022-03-16 13:56:45,819.819 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'group', 'hand', 'building', 'door', 'woman', 'ground', 'person', 'base', 'window', 'tree', 'snow', 'coat', 'hat', 'tag', 'pole', 'jacket', 'ski', 'fence', 'boot', 'helmet', 'poles', 'shoe', 'glove'] 2022-03-16 13:59:09,346.346 2829:trainer.py:487 do_train_dict(): eta: 21:31:44 iter: 19900 speed: 300.4 images/sec total_norm: 133.3119 (138.6806) loss: 150.1005 (148.5715) masked_loss: 1.6454 (1.6685) tag_loss: 148.6084 (146.9031) time: 1.4334 (1.7046) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4286 (1.6997) save_time: 9.0279 (30.4322) lr: 0.000070 max mem: 26307 2022-03-16 13:59:09,707.707 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5454545617103577 2022-03-16 13:59:09,707.707 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 158.2715606689453 2022-03-16 13:59:09,707.707 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.00655279159545 2022-03-16 13:59:20,371.371 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0185573510825634 2022-03-16 13:59:20,371.371 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 13:59:20,372.372 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'gi', '##raf', '##fe', 'bending', 'over', 'and', 'eating', '[MASK]', 'grass', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 13:59:20,387.387 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'head', 'spot', 'neck', 'mane', 'eye', 'tree', 'wall', 'ear', 'horn', 'ground', 'rock', 'mouth', 'zoo', 'trunk', 'grass', 'fence', 'hair', 'nose', 'pole', 'hay', 'plant', 'leg', 'face', 'tongue', 'bush', 'branch', 'enclosure', 'boulder', 'dirt', 'leaf', 'paw', 'next', 'pen', 'rope', 'trough', 'basket', 'stick', 'building', 'food', 'tail', 'container', 'ledge', 'standing', 'shirt', 'post', 'water', 'bird', 'handle', 'mesh'] 2022-03-16 13:59:36,320.320 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'ground', 'rock', 'mouth', 'wall', 'eye', 'neck', 'tree', 'spot', 'leg', 'tongue', 'ear', 'grass', 'pole', 'leaf', 'horn', 'fence', 'zoo', 'cord', 'bending', 'mane'] 2022-03-16 14:02:00,056.056 2829:trainer.py:487 do_train_dict(): eta: 21:29:07 iter: 20000 speed: 299.9 images/sec total_norm: 137.3051 (139.5711) loss: 150.4454 (151.1837) masked_loss: 1.6972 (1.6941) tag_loss: 149.2023 (149.4895) time: 1.4353 (1.7071) data: 0.0001 (0.0005) to_device: 0.0051 (0.0050) time_gpu: 1.4300 (1.7016) save_time: 9.0279 (30.4322) lr: 0.000070 max mem: 26307 2022-03-16 14:02:00,058.058 2829:checkpoint.py:222 save(): Saving checkpoint to output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0020000.pt 2022-03-16 14:02:09,169.169 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6315789222717285 2022-03-16 14:02:09,169.169 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 174.6197967529297 2022-03-16 14:02:09,169.169 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.00547940458232 2022-03-16 14:02:19,866.866 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018538890406489372 2022-03-16 14:02:19,866.866 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 14:02:19,866.866 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'bed', 'sitting', 'in', 'a', 'bedroom', 'next', 'to', 'a', 'window', '[MASK]', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 14:02:19,881.881 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['room', 'curtain', 'bed', 'chair', 'lamp', 'wall', 'floor', 'table', 'window', 'pillow', 'blanket', 'shade', 'bedroom', 'base', 'hotel', 'carpet', '[UNK]', 'vent', 'desk', 'sheet', 'ceiling', 'light', 'armchair', 'mirror', 'nightstand', 'television', 'wheel', 'leg', 'picture', 'drawer', 'white', 'phone', 'large', 'cabinet', 'arm', 'paper', 'dresser', 'air', 'red', 'bag', 'back', 'handle', 'outlet', 'book', 'stand', 'door', 'remote', 'cushion', 'telephone', 'furniture'] 2022-03-16 14:02:35,604.604 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'room', 'television', 'floor', 'bed', 'table', 'wall', 'chair', 'window', 'bowl', 'desk', 'bedroom', 'handle', 'wheel', 'sheet', 'shade', 'blanket', 'pillow', 'carpet', 'lamp', 'curtain', 'drawer', 'dresser'] 2022-03-16 14:04:58,641.641 2829:trainer.py:487 do_train_dict(): eta: 21:26:48 iter: 20100 speed: 286.7 images/sec total_norm: 136.4110 (141.9151) loss: 149.8997 (151.4312) masked_loss: 1.6591 (1.6626) tag_loss: 148.0607 (149.7686) time: 1.4343 (1.7859) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4294 (1.6934) save_time: 8.8805 (25.0095) lr: 0.000070 max mem: 26307 2022-03-16 14:04:59,002.002 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5714285969734192 2022-03-16 14:04:59,003.003 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 167.5060577392578 2022-03-16 14:04:59,003.003 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.99254791335304 2022-03-16 14:05:09,813.813 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01850227825343609 2022-03-16 14:05:09,813.813 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 14:05:09,814.814 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'family', 'is', 'grouped', 'on', '[MASK]', 'sun', 'porch', '[MASK]', 'a', 'photo', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 14:05:09,829.829 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hair', 'girl', 'tie', 'hand', 'head', 'shirt', 'wall', 'boy', 'jacket', 'child', 'woman', 'ear', 'scarf', '[UNK]', 'door', 'face', 'table', 'eye', 'sweater', 'window', 'group', 'nose', 'person', 'cup', 'picture', 'dress', 'young', 'paper', 'ponytail', 'floor', 'chair', 'jean', 'bag', 'kid', 'man', 'suit', 'little', 'basket', 'necklace', 'container', 'bottle', 'handle', 'stripe', 'bow', 'can', 'plate', 'shoe', 'glasses', 'bracelet', 'smile'] 2022-03-16 14:05:25,815.815 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'family', 'man', 'house', 'hand', 'face', 'door', 'woman', 'cup', 'hair', 'girl', 'child', 'wall', 'boy', 'sun', 'eye', 'window', 'shirt', 'ear', 'tie', 'photo', 'blind', 'jacket', 'porch', 'sweater', 'ponytail', 'scarf'] 2022-03-16 14:07:49,436.436 2829:trainer.py:487 do_train_dict(): eta: 21:24:10 iter: 20200 speed: 299.8 images/sec total_norm: 134.9062 (139.5070) loss: 152.6627 (153.4421) masked_loss: 1.6303 (1.6457) tag_loss: 150.9426 (151.7963) time: 1.4342 (1.7080) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4291 (1.7028) save_time: 8.8805 (25.0095) lr: 0.000070 max mem: 26307 2022-03-16 14:07:49,797.797 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.625 2022-03-16 14:07:49,798.798 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 169.94622802734375 2022-03-16 14:07:49,798.798 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.97797204003545 2022-03-16 14:08:00,580.580 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018487296998500824 2022-03-16 14:08:00,580.580 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 14:08:00,580.580 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'bananas', 'for', '[MASK]', 'that', 'are', '[MASK]', 'on', 'news', '##print', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 14:08:00,596.596 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['banana', 'table', 'newspaper', 'bunch', 'stem', '[UNK]', 'paper', 'fruit', 'person', 'picture', 'magazine', 'plate', 'man', 'pile', 'wall', 'bowl', 'pole', 'bag', 'box', 'cloth', 'basket', 'book', 'shirt', 'sign', 'ripe', 'display', 'spot', 'top', 'woman', 'background', 'bananas', 'writing', 'shelf', 'container', 'group', 'other', 'market', 'head', 'sale', 'photo', 'flower', 'hand', 'floor', 'plastic', 'hair', 'apple', 'orange', 'end', 'many', 'ground'] 2022-03-16 14:08:16,457.457 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'table', 'newspaper', 'picture', 'bowl', 'sale', 'plate', 'stem', 'bunch', 'shelf', 'banana'] 2022-03-16 14:10:40,199.199 2829:trainer.py:487 do_train_dict(): eta: 21:21:33 iter: 20300 speed: 299.8 images/sec total_norm: 135.6405 (138.1252) loss: 153.9690 (152.9460) masked_loss: 1.5877 (1.6659) tag_loss: 152.2151 (151.2801) time: 1.4326 (1.7076) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4274 (1.7025) save_time: 8.8805 (25.0095) lr: 0.000069 max mem: 26307 2022-03-16 14:10:40,559.559 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4848484992980957 2022-03-16 14:10:40,559.559 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 168.33628845214844 2022-03-16 14:10:40,559.559 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.98192755381267 2022-03-16 14:10:51,523.523 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018468158319592476 2022-03-16 14:10:51,523.523 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 14:10:51,523.523 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'there', 'is', 'a', 'bedroom', 'with', 'a', '[MASK]', 'and', 'desk', 'with', 'chair', 'in', 'it', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 14:10:51,538.538 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'floor', 'lamp', 'desk', 'keyboard', 'chair', 'door', 'computer', 'window', 'monitor', 'room', 'shade', 'bed', 'blind', 'rug', 'table', 'bag', 'bedroom', 'blanket', 'mouse', '[UNK]', 'cushion', 'leg', 'handle', 'pillow', 'picture', 'screen', 'laptop', 'drawer', 'outlet', 'office', 'light', 'box', 'knob', 'book', 'basket', 'switch', 'backpack', 'paper', 'shelf', 'phone', 'back', 'vent', 'home', 'frame', 'carpet', 'speaker', 'mat', 'base', 'pad'] 2022-03-16 14:11:07,456.456 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['room', 'door', 'light', 'floor', 'bed', 'wall', 'glass', 'chair', 'computer', 'window', 'bag', 'desk', 'bedroom', 'blind', 'remote', 'switch', 'monitor', 'shade', 'blanket', 'keyboard', 'pillow', 'lamp', 'backpack', 'mat', 'rug', 'cushion'] 03-16 14:12:40.618 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 14:12:40.618 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 14:12:41.737 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 94}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-16 14:13:31,043.043 2829:trainer.py:487 do_train_dict(): eta: 21:18:55 iter: 20400 speed: 299.7 images/sec total_norm: 134.2532 (136.4123) loss: 148.6723 (148.6688) masked_loss: 1.7369 (1.6979) tag_loss: 146.8832 (146.9710) time: 1.4342 (1.7084) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4291 (1.7033) save_time: 8.8805 (25.0095) lr: 0.000069 max mem: 26307 2022-03-16 14:13:31,403.403 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6470588445663452 2022-03-16 14:13:31,404.404 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 141.09841918945312 2022-03-16 14:13:31,404.404 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.99751797187619 2022-03-16 14:13:42,350.350 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01845521479845047 2022-03-16 14:13:42,350.350 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 14:13:42,351.351 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'man', 'running', 'down', 'part', 'of', 'a', 'half', 'pipe', 'while', 'holding', 'a', 'skate', '##board', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 14:13:42,366.366 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'helmet', 'man', '[UNK]', 'ramp', 'pad', 'shoe', 'knee', 'hand', 'short', 'head', 'wheel', 'sock', 'leg', 'elbow', 'skate', 'arm', 'line', 'ground', 'strap', 'shadow', 'glove', 'park', 'board', 'person', 'tree', 'face', 'logo', 'boy', 'skater', 'sky', 'wall', 'sign', 'mountain', 'trick', 'building', 'grass', 'fence', 'hair', 'guy', 'hill', 'slope', 'pole', 'background', 'sleeve', 'bowl', 'snow', 'street', 'light', 'house'] 2022-03-16 14:13:58,344.344 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'part', 'head', 'man', 'hand', 'face', 'half', 'short', 'ground', 'arm', 'shirt', 'snow', 'wheel', 'knee', 'pipe', 'elbow', 'helmet', 'shoe', 'pad', 'ramp', 'strap', 'sock'] 2022-03-16 14:16:21,941.941 2829:trainer.py:487 do_train_dict(): eta: 21:16:18 iter: 20500 speed: 299.6 images/sec total_norm: 133.7125 (136.8156) loss: 148.6973 (151.9906) masked_loss: 1.6160 (1.6008) tag_loss: 147.0124 (150.3898) time: 1.4338 (1.7090) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4286 (1.7039) save_time: 8.8805 (25.0095) lr: 0.000069 max mem: 26307 2022-03-16 14:16:22,302.302 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6388888955116272 2022-03-16 14:16:22,302.302 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 164.01275634765625 2022-03-16 14:16:22,302.302 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.98418198742912 2022-03-16 14:16:33,189.189 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018488284200429916 2022-03-16 14:16:33,189.189 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 14:16:33,189.189 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'black', 'dog', 'standing', 'on', 'top', 'of', 'a', 'tile', 'floor', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 14:16:33,204.204 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'room', 'floor', 'table', 'chair', '[UNK]', 'window', 'ceiling', 'picture', 'light', 'television', 'door', 'cabinet', 'mirror', 'shelf', 'couch', 'rug', 'pillow', 'fireplace', 'handle', 'kitchen', 'leg', 'cushion', 'glass', 'lamp', 'curtain', 'paper', 'shade', 'mantle', 'coffee', 'book', 'shirt', 'box', 'sofa', 'living', 'drawer', 'stool', 'carpet', 'clock', 'bag', 'microwave', 'hair', 'man', 'flower', 'remote', 'pot', 'switch', 'head', 'hand', 'building'] 2022-03-16 14:16:49,113.113 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'face', 'room', 'black', 'top', 'door', 'floor', 'wall', 'chest', 'eye', 'wood', 'ring', 'picture', 'dog', 'leg', 'nose', 'ear', 'cabinet', 'mirror', 'ceiling', 'patch', 'collar', 'cart', 'tile', 'stool', 'refrigerator'] 2022-03-16 14:19:12,748.748 2829:trainer.py:487 do_train_dict(): eta: 21:13:40 iter: 20600 speed: 299.8 images/sec total_norm: 134.9857 (136.9441) loss: 142.5376 (145.9681) masked_loss: 1.7123 (1.7453) tag_loss: 141.2660 (144.2227) time: 1.4327 (1.7081) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4275 (1.7029) save_time: 8.8805 (25.0095) lr: 0.000069 max mem: 26307 2022-03-16 14:19:13,111.111 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6470588445663452 2022-03-16 14:19:13,111.111 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 148.69049072265625 2022-03-16 14:19:13,111.111 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.97640001601067 2022-03-16 14:19:24,117.117 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018558017909526825 2022-03-16 14:19:24,118.118 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 14:19:24,118.118 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'bent', '##o', 'boxes', '[MASK]', 'a', 'variety', 'of', 'healthy', 'foods', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 14:19:24,134.134 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['carrot', 'table', 'cheese', 'food', 'tomato', 'grape', 'meat', '[UNK]', 'container', 'candy', 'bowl', 'star', 'slice', 'fruit', 'lemon', 'box', 'cookie', 'sausage', 'dish', 'plastic', 'vegetable', 'orange', 'mushroom', 'nut', 'lunch', 'tray', 'cake', 'bread', 'onion', 'dessert', 'flower', 'lid', 'potato', 'plate', 'bean', 'fork', 'ball', 'stem', 'face', 'cloth', 'different', 'paper', 'almond', 'handle', 'logo', 'egg', 'sandwich', 'piece', 'banana', 'pea'] 2022-03-16 14:19:40,103.103 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'star', 'table', 'food', 'box', 'variety', 'piece', 'meat', 'healthy', 'cheese', 'candy', 'container', 'slice', 'grape', 'mushroom', 'tomato', 'sausage', 'carrot'] 2022-03-16 14:22:03,630.630 2829:trainer.py:487 do_train_dict(): eta: 21:11:02 iter: 20700 speed: 299.6 images/sec total_norm: 134.2216 (136.0428) loss: 151.2389 (152.7164) masked_loss: 1.5952 (1.6751) tag_loss: 149.5527 (151.0414) time: 1.4329 (1.7088) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4278 (1.7036) save_time: 8.8805 (25.0095) lr: 0.000069 max mem: 26307 2022-03-16 14:22:03,991.991 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6470588445663452 2022-03-16 14:22:03,991.991 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 147.8448486328125 2022-03-16 14:22:03,991.991 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.97490704976596 2022-03-16 14:22:15,018.018 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018558382987976074 2022-03-16 14:22:15,018.018 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 14:22:15,019.019 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'small', 'cake', '[MASK]', 'some', 'candles', 'on', 'a', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 14:22:15,034.034 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['candle', 'cake', 'table', 'wall', '[UNK]', 'birthday', 'holder', 'cloth', 'curtain', 'box', 'room', 'glass', 'chair', 'writing', 'flame', 'paper', 'napkin', 'cardboard', 'plate', 'door', 'window', 'blue', 'hair', 'fork', 'shirt', 'knife', 'person', 'carpet', 'tray', 'sign', 'floor', 'top', 'cookie', 'base', 'book', 'woman', 'handle', 'word', 'stand', 'dress', 'flower', 'display', 'cup', 'ceiling', 'star', 'hand', 'card', 'spoon', 'decoration', 'bottle'] 2022-03-16 14:22:31,011.011 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'small', 'door', 'table', 'wall', 'chair', 'paper', 'box', 'sign', 'picture', 'frame', 'handle', 'plate', 'pole', 'cloth', 'flame', 'holder', 'lighter', 'cake', 'curtain', 'necklace', 'outlet', 'candle'] 2022-03-16 14:24:54,607.607 2829:trainer.py:487 do_train_dict(): eta: 21:08:25 iter: 20800 speed: 299.5 images/sec total_norm: 137.4580 (138.7475) loss: 152.2684 (152.1884) masked_loss: 1.7015 (1.6724) tag_loss: 150.7429 (150.5159) time: 1.4337 (1.7098) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4285 (1.7047) save_time: 8.8805 (25.0095) lr: 0.000069 max mem: 26307 2022-03-16 14:24:54,968.968 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.529411792755127 2022-03-16 14:24:54,968.968 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 180.63795471191406 2022-03-16 14:24:54,968.968 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.96078127993351 2022-03-16 14:25:06,129.129 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01858743093907833 2022-03-16 14:25:06,129.129 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 14:25:06,130.130 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'pretty', '[MASK]', 'with', 'some', 'soup', 'in', 'it', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 14:25:06,145.145 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['bowl', 'table', 'plate', 'soup', '[UNK]', 'spoon', 'design', 'food', 'onion', 'glass', 'cup', 'shadow', 'handle', 'vegetable', 'leaf', 'flower', 'napkin', 'white', 'sauce', 'line', 'pepper', 'dish', 'cheese', 'meat', 'pea', 'lemon', 'fork', 'carrot', 'salad', 'fish', 'pasta', 'mushroom', 'shrimp', 'knife', 'reflection', 'potato', 'chicken', 'bread', 'rice', 'cloth', 'herb', 'egg', 'bowls', 'green', 'water', 'top', 'fruit', 'logo', 'cream', 'orange'] 2022-03-16 14:25:22,062.062 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'design', 'table', 'food', 'glass', 'pretty', 'bowl', 'plate', 'flower', 'fork', 'soup', 'pepper', 'napkin'] 2022-03-16 14:27:45,681.681 2829:trainer.py:487 do_train_dict(): eta: 21:05:47 iter: 20900 speed: 299.3 images/sec total_norm: 138.3309 (141.0760) loss: 149.7887 (150.5588) masked_loss: 1.6607 (1.6784) tag_loss: 148.1349 (148.8804) time: 1.4333 (1.7107) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4283 (1.7057) save_time: 8.8805 (25.0095) lr: 0.000069 max mem: 26307 2022-03-16 14:27:46,044.044 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5625 2022-03-16 14:27:46,045.045 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 169.3606719970703 2022-03-16 14:27:46,045.045 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.9681798480806 2022-03-16 14:27:57,147.147 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018563518300652504 2022-03-16 14:27:57,147.147 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 14:27:57,148.148 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'motorcycle', 'rider', 'driving', 'down', '[MASK]', 'referring', 'dirt', 'road', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 14:27:57,163.163 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'road', 'bush', 'motorcycle', 'helmet', 'forest', 'wood', 'man', 'person', 'jacket', '[UNK]', 'tire', 'path', 'branch', 'grass', 'windshield', 'bike', 'trunk', 'trail', 'wheel', 'ground', 'vehicle', 'mirror', 'dirt', 'leaf', 'car', 'wooded', 'head', 'light', 'truck', 'line', 'track', 'hat', 'hill', 'plant', 'rock', 'side', 'shirt', 'glove', 'group', 'sky', 'area', 'sign', 'brush', 'country', 'bag', 'pole', 'small', 'wall', 'biker'] 2022-03-16 14:28:13,101.101 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'road', 'person', 'forest', 'tree', 'wood', 'trail', 'mirror', 'bush', 'dirt', 'rider', 'flame', 'motorcycle', 'helmet', 'tire', 'glove', 'wooded'] 2022-03-16 14:30:36,960.960 2829:trainer.py:487 do_train_dict(): eta: 21:03:10 iter: 21000 speed: 298.9 images/sec total_norm: 134.4053 (138.1352) loss: 148.8923 (150.7054) masked_loss: 1.5951 (1.6500) tag_loss: 147.4959 (149.0553) time: 1.4347 (1.7128) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4297 (1.7077) save_time: 8.8805 (25.0095) lr: 0.000068 max mem: 26307 2022-03-16 14:30:37,320.320 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6129032373428345 2022-03-16 14:30:37,320.320 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 126.22579956054688 2022-03-16 14:30:37,321.321 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.0021768181245 2022-03-16 14:30:48,433.433 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018592912703752518 2022-03-16 14:30:48,433.433 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 14:30:48,434.434 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'is', 'squat', '##ting', 'outside', 'by', 'some', '[MASK]', '##s', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 14:30:48,449.449 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['window', 'wall', 'building', '[UNK]', 'flower', 'blind', 'sidewalk', 'brick', 'shoe', 'hand', 'man', 'leg', 'plant', 'head', 'arm', 'ground', 'hair', 'shadow', 'person', 'jacket', 'step', 'ledge', 'line', 'face', 'black', 'curb', 'woman', 'shirt', 'pole', 'block', 'jean', 'bar', 'door', 'white', 'leaf', 'coat', 'sign', 'front', 'hat', 'bag', 'road', 'wheel', 'umbrella', 'handle', 'cap', 'light', 'pot', 'base', 'bicycle', 'street'] 2022-03-16 14:31:04,460.460 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'face', 'building', 'woman', 'short', 'girl', 'outside', 'person', 'wall', 'arm', 'plant', 'window', 'step', 'watch', 'box', 'shirt', 'bag', 'camera', 'handle', 'hat', 'blind', 'tag', 'flower', 'hood', 'arrow', 'boot', 'sidewalk', 'backpack', 'suitcase', 'strap', 'luggage'] 2022-03-16 14:33:28,104.104 2829:trainer.py:487 do_train_dict(): eta: 21:00:32 iter: 21100 speed: 299.2 images/sec total_norm: 132.8051 (134.6574) loss: 153.2119 (152.9291) masked_loss: 1.6315 (1.6781) tag_loss: 151.5865 (151.2510) time: 1.4343 (1.7115) data: 0.0001 (0.0005) to_device: 0.0050 (0.0048) time_gpu: 1.4294 (1.7061) save_time: 8.8805 (25.0095) lr: 0.000068 max mem: 26307 2022-03-16 14:33:28,465.465 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6060606241226196 2022-03-16 14:33:28,465.465 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 131.84347534179688 2022-03-16 14:33:28,465.465 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.01844263976474 2022-03-16 14:33:39,667.667 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01859108731150627 2022-03-16 14:33:39,668.668 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 14:33:39,668.668 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'person', 'riding', 'banking', 'snow', '##board', 'on', 'a', 'mountain', 'slope', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 14:33:39,683.683 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'jacket', 'man', 'snow', 'ground', 'person', 'glove', 'head', 'arm', 'coat', 'hand', 'helmet', 'board', 'face', 'sky', 'leg', 'hill', 'tree', 'mountain', 'slope', 'boot', 'hood', 'foot', 'hat', 'yellow', 'track', 'skier', 'pole', 'air', 'sleeve', 'design', 'snowy', 'logo', 'cloud', 'hair', 'ski', 'stripe', 'tag', 'line', 'background', 'strap', 'black', 'steep', 'shadow', 'side', 'patch', 'backpack', 'boy', 'scarf', 'shoe'] 2022-03-16 14:33:55,587.587 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'face', 'ground', 'board', 'person', 'mountain', 'sky', 'snow', 'coat', 'cloud', 'jacket', 'slope', 'helmet'] 2022-03-16 14:36:19,412.412 2829:trainer.py:487 do_train_dict(): eta: 20:57:54 iter: 21200 speed: 298.9 images/sec total_norm: 133.5796 (136.8119) loss: 149.9103 (151.3754) masked_loss: 1.6800 (1.6548) tag_loss: 148.4138 (149.7206) time: 1.4337 (1.7130) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4285 (1.7079) save_time: 8.8805 (25.0095) lr: 0.000068 max mem: 26307 2022-03-16 14:36:19,772.772 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.65625 2022-03-16 14:36:19,772.772 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 124.99519348144531 2022-03-16 14:36:19,773.773 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.02488352099495 2022-03-16 14:36:31,032.032 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018592767417430878 2022-03-16 14:36:31,032.032 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 14:36:31,032.032 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'pair', 'of', 'head', '##phones', 'in', 'a', 'package', 'on', 'a', 'table', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 14:36:31,048.048 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['scissors', 'table', 'handle', '[UNK]', 'blade', 'cord', 'light', 'box', 'tape', 'pair', 'desk', 'plastic', 'floor', 'wall', 'wire', 'base', 'chair', 'container', 'case', 'drawer', 'strap', 'blue', 'leg', 'paper', 'top', 'stand', 'display', 'cloth', 'wooden', 'string', 'pen', 'door', 'person', 'book', 'band', 'lid', 'laptop', 'cabinet', 'screen', 'bag', 'computer', 'white', 'tray', 'open', 'screw', 'knife', 'button', 'phone', 'hole', 'vent'] 2022-03-16 14:36:46,982.982 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'light', 'person', 'table', 'box', 'jean', 'pair', 'handle', 'blade', 'tape', 'package', 'cord', 'scissors'] 2022-03-16 14:39:10,647.647 2829:trainer.py:487 do_train_dict(): eta: 20:55:17 iter: 21300 speed: 299.0 images/sec total_norm: 136.7047 (139.6794) loss: 150.6798 (150.1393) masked_loss: 1.6671 (1.6495) tag_loss: 149.0127 (148.4899) time: 1.4341 (1.7124) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4289 (1.7073) save_time: 8.8805 (25.0095) lr: 0.000068 max mem: 26307 2022-03-16 14:39:11,009.009 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6176470518112183 2022-03-16 14:39:11,010.010 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 130.20590209960938 2022-03-16 14:39:11,010.010 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.03436915673942 2022-03-16 14:39:22,297.297 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01860078237950802 2022-03-16 14:39:22,297.297 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 14:39:22,298.298 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'bus', 'pulled', 'up', 'to', '[MASK]', 'empty', 'bus', 'stop', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 14:39:22,313.313 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['window', 'bus', 'windshield', 'building', 'sky', 'sign', 'front', 'fence', 'flag', 'road', '[UNK]', 'light', 'street', 'pole', 'number', 'door', 'mirror', 'railing', 'wheel', 'tire', 'roof', 'banner', 'license', 'sidewalk', 'plate', 'line', 'driver', 'advertisement', 'rail', 'car', 'rack', 'letter', 'stripe', 'logo', 'wall', 'white', 'tree', 'post', 'bumper', 'stop', 'ground', 'person', 'man', 'large', 'top', 'city', 'arrow', 'bike', 'floor', 'water'] 2022-03-16 14:39:38,222.222 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'can', 'number', 'public', 'water', 'building', 'door', 'road', 'front', 'street', 'ground', 'stop', 'person', 'window', 'sky', 'bus', 'empty', 'roof', 'flag', 'wheel', 'mirror', 'cloud', 'pole', 'arrow', 'fence', 'tent', 'banner', 'trash', 'railing', 'stripe', 'windshield'] 2022-03-16 14:42:01,951.951 2829:trainer.py:487 do_train_dict(): eta: 20:52:39 iter: 21400 speed: 298.9 images/sec total_norm: 134.8360 (137.4399) loss: 149.9980 (152.7794) masked_loss: 1.6302 (1.6743) tag_loss: 148.6584 (151.1051) time: 1.4336 (1.7131) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4287 (1.7079) save_time: 8.8805 (25.0095) lr: 0.000068 max mem: 26307 2022-03-16 14:42:02,312.312 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6000000238418579 2022-03-16 14:42:02,312.312 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 149.53526306152344 2022-03-16 14:42:02,312.312 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.03504282042037 2022-03-16 14:42:13,770.770 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018604183569550514 2022-03-16 14:42:13,770.770 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 14:42:13,770.770 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'plate', 'holding', 'a', 'pizza', 'next', 'to', 'book', 'and', 'glass', 'of', 'wine', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 14:42:13,786.786 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['pizza', 'glass', 'table', 'plate', 'crust', 'wine', 'base', 'fork', 'book', 'slice', '[UNK]', 'stem', 'letter', 'knife', 'napkin', 'handle', 'newspaper', 'cheese', 'bubble', 'paper', 'topping', 'picture', 'reflection', 'pie', 'drink', 'top', 'writing', 'menu', 'food', 'chicken', 'magazine', 'meat', 'next', 'bird', 'ice', 'water', 'olive', 'bottom', 'shadow', 'white', 'foam', 'bottle', 'cup', 'piece', 'red', 'tray', 'bacon', 'person', 'spoon', 'logo'] 2022-03-16 14:42:29,695.695 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['white', 'book', 'table', 'base', 'writing', 'glass', 'newspaper', 'wine', 'plate', 'fork', 'pizza', 'slice'] 03-16 14:42:41.837 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 14:42:41.837 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 14:42:43.089 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-16 14:44:53,140.140 2829:trainer.py:487 do_train_dict(): eta: 20:50:01 iter: 21500 speed: 299.1 images/sec total_norm: 134.1793 (138.6764) loss: 149.8346 (152.7388) masked_loss: 1.6058 (1.6191) tag_loss: 147.7528 (151.1197) time: 1.4330 (1.7119) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4279 (1.7068) save_time: 8.8805 (25.0095) lr: 0.000068 max mem: 26307 2022-03-16 14:44:53,501.501 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6285714507102966 2022-03-16 14:44:53,501.501 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 158.74705505371094 2022-03-16 14:44:53,502.502 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.04239709288986 2022-03-16 14:45:05,011.011 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018586475402116776 2022-03-16 14:45:05,011.011 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 14:45:05,012.012 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'bathroom', 'scene', 'looking', 'at', '[MASK]', 'sink', 'and', 'the', 'toilet', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 14:45:05,027.027 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'toilet', 'bathroom', 'lid', 'seat', 'floor', 'handle', '[UNK]', 'door', 'tank', 'sink', 'rack', 'holder', 'paper', 'bar', 'tile', 'towel', 'cabinet', 'base', 'knob', 'bowl', 'box', 'rod', 'can', 'shelf', 'white', 'light', 'outlet', 'drain', 'pipe', 'curtain', 'ceiling', 'mirror', 'small', 'shower', 'roll', 'bottle', 'water', 'trash', 'reflection', 'vent', 'bag', 'cover', 'window', 'soap', 'drawer', 'rug', 'tissue', 'switch', 'frame'] 2022-03-16 14:45:21,006.006 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'top', 'door', 'light', 'wall', 'scene', 'tank', 'handle', 'cabinet', 'bathroom', 'sink', 'towel', 'shelf', 'toilet', 'lid', 'rack'] 2022-03-16 14:47:44,505.505 2829:trainer.py:487 do_train_dict(): eta: 20:47:23 iter: 21600 speed: 298.8 images/sec total_norm: 134.3220 (136.0565) loss: 150.1140 (151.6686) masked_loss: 1.5860 (1.6312) tag_loss: 148.1234 (150.0374) time: 1.4330 (1.7136) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4278 (1.7085) save_time: 8.8805 (25.0095) lr: 0.000067 max mem: 26307 2022-03-16 14:47:44,866.866 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5142857432365417 2022-03-16 14:47:44,866.866 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 161.08016967773438 2022-03-16 14:47:44,866.866 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.06031888970581 2022-03-16 14:47:56,399.399 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018601270392537117 2022-03-16 14:47:56,399.399 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 14:47:56,400.400 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', 'efficiency', 'apartment', '[MASK]', 'a', 'living', 'room', ',', 'dining', 'room', 'and', 'kitchen', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 14:47:56,415.415 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['table', 'room', 'pillow', 'couch', 'lamp', 'window', 'curtain', 'floor', 'wall', 'shade', 'chair', 'sofa', 'ceiling', 'living', 'coffee', 'leg', 'television', 'cushion', 'carpet', 'picture', '[UNK]', 'door', 'mirror', 'light', 'book', 'end', 'armchair', 'plant', 'vase', 'flower', 'furniture', 'blanket', 'frame', 'drawer', 'vent', 'bowl', 'cabinet', 'rug', 'glass', 'large', 'base', 'desk', 'pot', 'painting', 'top', 'fireplace', 'arm', 'dresser', 'stand', 'shelf'] 2022-03-16 14:48:12,368.368 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['room', 'book', 'living', 'floor', 'table', 'wall', 'chair', 'window', 'kitchen', 'picture', 'coffee', 'apartment', 'bowl', 'desk', 'frame', 'plate', 'cabinet', 'ceiling', 'couch', 'flower', 'efficiency', 'shade', 'pillow', 'carpet', 'lamp', 'sofa', 'curtain', 'vase', 'cushion', 'jug'] 2022-03-16 14:50:35,859.859 2829:trainer.py:487 do_train_dict(): eta: 20:44:45 iter: 21700 speed: 298.8 images/sec total_norm: 136.5767 (140.0194) loss: 150.2529 (151.9859) masked_loss: 1.6638 (1.6895) tag_loss: 148.2034 (150.2964) time: 1.4324 (1.7135) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4272 (1.7084) save_time: 8.8805 (25.0095) lr: 0.000067 max mem: 26307 2022-03-16 14:50:36,219.219 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091 2022-03-16 14:50:36,219.219 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 168.2318878173828 2022-03-16 14:50:36,220.220 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.04952903187603 2022-03-16 14:50:47,674.674 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018620865419507027 2022-03-16 14:50:47,676.676 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 14:50:47,676.676 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'person', 'and', 'rosario', '##board', 'in', 'the', 'air', 'in', 'an', '[MASK]', 'area', 'with', 'posts', 'and', 'graffiti', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 14:50:47,691.691 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', '[UNK]', 'arm', 'man', 'ground', 'head', 'leg', 'hat', 'person', 'hand', 'face', 'sign', 'short', 'board', 'floor', 'foot', 'shoe', 'tree', 'wall', 'light', 'boy', 'wheel', 'jean', 'poster', 'building', 'background', 'shadow', 'air', 'trick', 'picture', 'graffiti', 'cap', 'pad', 'logo', 'woman', 'street', 'ramp', 'knee', 'sky', 'pool', 'band', 'belt', 'window', 'skate', 'pole', 'reflection', 'banner', 'line', 'ball', 'ceiling'] 2022-03-16 14:51:03,611.611 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'area', 'hand', 'face', 'air', 'building', 'ground', 'person', 'wall', 'arm', 'base', 'window', 'sign', 'shirt', 'picture', 'leg', 'bottle', 'ceiling', 'column', 'hat', 'cap', 'indoor', 'pad', 'pillar', 'graffiti'] 2022-03-16 14:53:27,334.334 2829:trainer.py:487 do_train_dict(): eta: 20:42:07 iter: 21800 speed: 298.6 images/sec total_norm: 134.2628 (138.1374) loss: 154.3311 (152.8688) masked_loss: 1.7063 (1.7285) tag_loss: 152.8481 (151.1403) time: 1.4334 (1.7148) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4287 (1.7098) save_time: 8.8805 (25.0095) lr: 0.000067 max mem: 26307 2022-03-16 14:53:27,696.696 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5833333134651184 2022-03-16 14:53:27,697.697 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 151.42822265625 2022-03-16 14:53:27,697.697 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.0558759959321 2022-03-16 14:53:39,198.198 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018735645338892937 2022-03-16 14:53:39,199.199 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 14:53:39,200.200 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'person', 'standing', 'on', 'a', 'tennis', 'recipes', 'holding', 'a', 'tennis', 'ra', '##c', '##quet', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 14:53:39,215.215 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['line', 'shoe', 'fence', 'shirt', 'sock', 'short', 'court', '[UNK]', 'pole', 'tennis', 'leg', 'man', 'tree', 'hand', 'post', 'arm', 'ball', 'head', 'person', 'grass', 'ground', 'hair', 'bush', 'knee', 'boy', 'player', 'top', 'handle', 'air', 'woman', 'face', 'hat', 'leaf', 'dirt', 'sky', 'foot', 'stripe', 'cap', 'bat', 'game', 'young', 'logo', 'roof', 'bench', 'plant', 'tank', 'net', 'match', 'swing', 'wall'] 2022-03-16 14:53:55,095.095 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'hand', 'line', 'court', 'short', 'hair', 'post', 'person', 'arm', 'foot', 'tree', 'shirt', 'leg', 'tennis', 'bush', 'pole', 'fence', 'shoe', 'sock'] 2022-03-16 14:56:18,829.829 2829:trainer.py:487 do_train_dict(): eta: 20:39:29 iter: 21900 speed: 298.6 images/sec total_norm: 135.4611 (138.8950) loss: 152.2609 (153.3947) masked_loss: 1.6345 (1.6399) tag_loss: 150.8832 (151.7547) time: 1.4332 (1.7149) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4280 (1.7097) save_time: 8.8805 (25.0095) lr: 0.000067 max mem: 26307 2022-03-16 14:56:19,188.188 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663 2022-03-16 14:56:19,189.189 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 126.06427001953125 2022-03-16 14:56:19,189.189 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.10279261849143 2022-03-16 14:56:30,828.828 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018765972927212715 2022-03-16 14:56:30,828.828 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 14:56:30,828.828 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'baseball', 'player', 'holding', 'a', 'bat', 'while', 'standing', '[MASK]', 'a', 'field', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 14:56:30,844.844 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'helmet', 'field', 'line', 'catcher', 'man', 'shirt', 'grass', 'glove', 'bat', 'wall', 'dirt', 'uniform', 'batter', 'umpire', 'shoe', 'player', 'baseball', 'mask', 'leg', 'shadow', 'plate', 'stand', 'home', 'person', 'head', 'fence', 'game', 'hat', 'ground', 'guard', 'belt', 'ball', 'sign', 'jersey', 'hand', 'advertisement', 'banner', 'shin', 'crowd', 'logo', 'spectator', 'railing', 'ready', 'camera', 'pitch', 'cap', 'number', 'cooler', 'base'] 2022-03-16 14:56:46,747.747 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'home', 'line', 'player', 'field', 'person', 'wall', 'base', 'stand', 'baseball', 'shirt', 'leg', 'crowd', 'plate', 'shadow', 'grass', 'belt', 'hat', 'uniform', 'dirt', 'bat', 'mask', 'helmet', 'shoe', 'catcher', 'glove', 'umpire', 'batter'] 2022-03-16 14:59:10,228.228 2829:trainer.py:487 do_train_dict(): eta: 20:36:51 iter: 22000 speed: 298.7 images/sec total_norm: 135.7891 (138.4069) loss: 153.4985 (153.1872) masked_loss: 1.6509 (1.6695) tag_loss: 151.6041 (151.5176) time: 1.4330 (1.7140) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4277 (1.7089) save_time: 8.8805 (25.0095) lr: 0.000067 max mem: 26307 2022-03-16 14:59:10,590.590 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5625 2022-03-16 14:59:10,590.590 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 142.63137817382812 2022-03-16 14:59:10,590.590 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
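A quick consistency check on the do_train_dict() lines above: the smoothed speed multiplied by the smoothed per-iteration time comes out at ~512 images at every logging interval shown, i.e. the run's global batch size is about 512. This is an inference from the logged numbers, not something the log states; the sketch below just re-derives it from three (speed, time) pairs copied from this section.

    # Inference from the logged numbers (not stated in the log): "speed"
    # (images/sec) times the smoothed per-iteration time is ~512 at every
    # logging interval, i.e. a global batch of roughly 512 images.
    pairs = [(298.8, 1.7136),   # iter 21600
             (298.7, 1.7140),   # iter 22000
             (297.5, 1.7211)]   # iter 22700
    for speed, avg_time in pairs:
        print(round(speed * avg_time))  # prints 512 for each pair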
= 70.11265986861147 2022-03-16 14:59:22,343.343 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018753327429294586 2022-03-16 14:59:22,343.343 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 14:59:22,343.343 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'number', 'of', 'people', '[MASK]', 'a', 'beach', 'with', 'many', '[MASK]', '##s', 'flying', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 14:59:22,359.359 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['kite', 'sky', 'person', 'sand', 'mountain', 'beach', 'cloud', 'short', 'string', 'hill', 'man', 'shirt', 'water', 'woman', '[UNK]', 'air', 'hat', 'bag', 'building', 'house', 'tail', 'tent', 'head', 'couple', 'wave', 'chair', 'leg', 'hair', 'tree', 'group', 'parachute', 'umbrella', 'rope', 'logo', 'shadow', 'child', 'top', 'day', 'sandy', 'boy', 'foot', 'balloon', 'footprint', 'blanket', 'grass', 'boat', 'flag', 'bikini', 'ground', 'ocean'] 2022-03-16 14:59:38,391.391 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'many', 'man', 'number', 'air', 'short', 'ground', 'person', 'hill', 'mountain', 'beach', 'sky', 'shirt', 'string', 'sand', 'cloud', 'kite'] 2022-03-16 15:02:01,945.945 2829:trainer.py:487 do_train_dict(): eta: 20:34:13 iter: 22100 speed: 298.2 images/sec total_norm: 134.0390 (137.1120) loss: 150.2495 (151.1519) masked_loss: 1.6509 (1.6876) tag_loss: 148.2164 (149.4642) time: 1.4338 (1.7172) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4287 (1.7121) save_time: 8.8805 (25.0095) lr: 0.000067 max mem: 26307 2022-03-16 15:02:02,305.305 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5714285969734192 2022-03-16 15:02:02,306.306 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 134.83810424804688 2022-03-16 15:02:02,306.306 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.12876385181873 2022-03-16 15:02:13,944.944 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018755590543150902 2022-03-16 15:02:13,945.945 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 15:02:13,945.945 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', '[MASK]', 'is', 'interested', 'in', 'what', 'is', '[MASK]', 'on', 'his', 'phone', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 15:02:13,960.960 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'man', 'sky', 'hand', 'tree', 'railing', 'hair', 'head', 'phone', 'ear', 'water', 'short', 'grass', 'fence', '[UNK]', 'foil', 'eye', 'face', 'floor', 'nose', 'cell', 'rail', 'person', 'collar', 'mouth', 'pole', 'balcony', 'post', 'reflection', 'building', 'arm', 'camera', 'table', 'bush', 'park', 'leg', 'deck', 'handle', 'chair', 'finger', 'bench', 'porch', 'bottle', 'next', 'window', 'leaf', 'sidewalk', 'white', 'metal', 'top'] 2022-03-16 15:02:29,833.833 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'hand', 'face', 'water', 'short', 'hair', 'floor', 'phone', 'tree', 'sky', 'shirt', 'ear', 'interested', 'porch', 'fence', 'railing', 'foil'] 2022-03-16 15:04:53,700.700 2829:trainer.py:487 do_train_dict(): eta: 20:31:35 iter: 22200 speed: 298.1 images/sec total_norm: 135.2420 (137.9736) loss: 148.1449 (148.5095) masked_loss: 1.6983 (1.6648) tag_loss: 145.9914 (146.8447) time: 1.4336 (1.7176) data: 0.0001 (0.0005) to_device: 0.0051 (0.0050) time_gpu: 1.4285 (1.7121) save_time: 8.8805 (25.0095) lr: 0.000067 max mem: 26307 2022-03-16 15:04:54,061.061 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5151515007019043 2022-03-16 15:04:54,062.062 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 135.98159790039062 2022-03-16 15:04:54,062.062 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.14292232017345 2022-03-16 15:05:05,908.908 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018771588802337646 2022-03-16 15:05:05,909.909 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 15:05:05,909.909 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'pup', '##pies', 'playing', 'in', 'the', 'green', 'grass', 'of', 'their', 'yard', '[MASK]', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 15:05:05,924.924 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['grass', 'dog', 'collar', 'leg', 'tail', 'head', 'ear', '[UNK]', 'neck', 'paw', 'mouth', 'field', 'face', 'foot', 'nose', 'eye', 'spot', 'fence', 'ground', 'person', 'grassy', 'back', 'air', 'white', 'tag', 'green', 'wall', 'black', 'bush', 'fur', 'leaf', 'flower', 'legs', 'pole', 'tree', 'group', 'trunk', 'small', 'body', 'patch', 'dirt', 'park', 'top', 'chest', 'other', 'house', 'yard', 'hair', 'post', 'toy'] 2022-03-16 15:05:21,862.862 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'face', 'green', 'mouth', 'neck', 'dog', 'leg', 'yard', 'ear', 'grass', 'tail', 'collar', 'paw'] 2022-03-16 15:07:45,421.421 2829:trainer.py:487 do_train_dict(): eta: 20:28:57 iter: 22300 speed: 298.2 images/sec total_norm: 137.2633 (138.9839) loss: 149.4451 (149.7311) masked_loss: 1.5724 (1.6184) tag_loss: 147.5350 (148.1127) time: 1.4330 (1.7172) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4279 (1.7122) save_time: 8.8805 (25.0095) lr: 0.000066 max mem: 26307 2022-03-16 15:07:45,782.782 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.44117647409439087 2022-03-16 15:07:45,782.782 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 149.8798828125 2022-03-16 15:07:45,782.782 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.1530785730907 2022-03-16 15:07:57,543.543 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018815239891409874 2022-03-16 15:07:57,543.543 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 15:07:57,544.544 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', 'hitter', 'prepares', 'to', 'get', '[MASK]', 'the', 'batter', '##s', 'box', 'for', 'the', 'pitch', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 15:07:57,559.559 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'stand', 'shirt', 'man', '[UNK]', 'helmet', 'seat', 'shoe', 'woman', 'game', 'hat', 'person', 'player', 'stadium', 'field', 'number', 'uniform', 'catcher', 'baseball', 'line', 'bat', 'glove', 'umpire', 'jersey', 'cap', 'stair', 'chair', 'mask', 'grass', 'dirt', 'spectator', 'fence', 'head', 'batter', 'hair', 'hand', 'sunglasses', 'boy', 'bag', 'glasses', 'leg', 'barrier', 'ground', 'railing', 'step', 'ball', 'logo', 'belt', 'plate', 'base'] 2022-03-16 15:08:13,476.476 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'hand', 'number', 'game', 'line', 'player', 'woman', 'field', 'person', 'child', 'wall', 'seat', 'stand', 'chair', 'stadium', 'baseball', 'ball', 'shirt', 'jersey', 'bag', 'hat', 'uniform', 'pitch', 'bat', 'mask', 'glasses', 'helmet', 'shoe', 'catcher', 'glove', 'hitter', 'umpire', 'spectator', 'stair'] 2022-03-16 15:10:37,248.248 2829:trainer.py:487 do_train_dict(): eta: 20:26:19 iter: 22400 speed: 298.0 images/sec total_norm: 135.1457 (138.7956) loss: 151.7771 (151.7263) masked_loss: 1.5879 (1.6285) tag_loss: 150.1920 (150.0979) time: 1.4338 (1.7183) data: 0.0001 (0.0002) to_device: 0.0050 (0.0050) time_gpu: 1.4286 (1.7132) save_time: 8.8805 (25.0095) lr: 0.000066 max mem: 26307 2022-03-16 15:10:37,608.608 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361 2022-03-16 15:10:37,608.608 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 144.73355102539062 2022-03-16 15:10:37,609.609 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.16920547485351 2022-03-16 15:10:49,522.522 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01884065568447113 2022-03-16 15:10:49,522.522 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 15:10:49,522.522 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'astronomer', 'baseball', 'player', 'is', 'successful', 'in', 'his', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 15:10:49,538.538 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', '[UNK]', 'sky', 'shoe', 'man', 'bat', 'dirt', 'pole', 'baseball', 'helmet', 'uniform', 'fence', 'player', 'belt', 'plate', 'person', 'leg', 'grass', 'field', 'arm', 'ground', 'batter', 'home', 'hand', 'bridge', 'head', 'building', 'line', 'glove', 'base', 'game', 'light', 'tree', 'ball', 'catcher', 'jersey', 'sign', 'umpire', 'cloud', 'boy', 'hat', 'foot', 'roof', 'short', 'ready', 'tower', 'swing', 'bench', 'mask', 'number'] 2022-03-16 15:11:05,559.559 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'home', 'hand', 'player', 'field', 'ground', 'person', 'bridge', 'successful', 'stadium', 'tree', 'attempt', 'baseball', 'ball', 'sign', 'sky', 'shirt', 'plate', 'grass', 'belt', 'uniform', 'pole', 'dirt', 'bat', 'wire', 'logo', 'fence', 'helmet', 'shoe'] 03-16 15:12:43.178 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 15:12:43.178 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 15:12:44.354 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 95}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 96}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 97}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-16 15:13:28,953.953 2829:trainer.py:487 do_train_dict(): eta: 20:23:41 iter: 22500 speed: 298.2 images/sec total_norm: 137.7139 (139.5286) loss: 151.3708 (150.3748) masked_loss: 1.6112 (1.6567) tag_loss: 149.7876 (148.7181) time: 1.4333 (1.7171) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4281 (1.7120) save_time: 8.8805 (25.0095) lr: 0.000066 max mem: 26307 2022-03-16 15:13:29,315.315 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7142857313156128 2022-03-16 15:13:29,315.315 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 161.44058227539062 2022-03-16 15:13:29,315.315 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
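The aml_server.py monitor() records interleaved with the training output print a Python-literal list with one dict per GPU. A minimal sketch for condensing such a snapshot into summary numbers, assuming only the three keys the log actually shows; the function name summarize and the truncated two-GPU payload are illustrative:

    import ast

    def summarize(payload: str) -> str:
        """Condense one monitor() snapshot (a list of per-GPU dicts)."""
        gpus = ast.literal_eval(payload)  # the log prints a Python literal, not JSON
        avg_util = sum(g['gpu_util'] for g in gpus) / len(gpus)
        peak_mem = max(g['mem_used'] / g['mem_total'] for g in gpus)
        return f"{len(gpus)} GPUs, avg util {avg_util:.0f}%, peak mem {peak_mem:.0%}"

    snap = ("[{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 95}, "
            "{'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}]")
    print(summarize(snap))  # -> 2 GPUs, avg util 98%, peak mem 89%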
= 70.17193963042402 2022-03-16 15:13:41,138.138 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01885174959897995 2022-03-16 15:13:41,139.139 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 15:13:41,139.139 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'plate', '[MASK]', 'rolls', 'and', 'some', '[MASK]', '##tu', '##ce', 'with', 'a', '[MASK]', 'on', 'the', 'side', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 15:13:41,154.154 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['table', '[UNK]', 'bread', 'sandwich', 'plate', 'salad', 'glass', 'food', 'bowl', 'cup', 'tray', 'fork', 'napkin', 'person', 'handle', 'container', 'knife', 'bottle', 'leaf', 'wall', 'chair', 'water', 'bun', 'meat', 'tomato', 'floor', 'basket', 'spoon', 'liquid', 'lid', 'carrot', 'hand', 'label', 'window', 'top', 'paper', 'lemon', 'ground', 'green', 'onion', 'shirt', 'shadow', 'wine', 'pepper', 'drink', 'flower', 'tile', 'light', 'logo', 'arm'] 2022-03-16 15:13:57,079.079 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'side', 'top', 'table', 'food', 'chair', 'drink', 'handle', 'plate', 'bottle', 'liquid', 'bread', 'fork', 'sandwich', 'container', 'lid', 'jar', 'salad', 'napkin'] 2022-03-16 15:16:20,846.846 2829:trainer.py:487 do_train_dict(): eta: 20:21:03 iter: 22600 speed: 297.9 images/sec total_norm: 139.6925 (141.5385) loss: 151.0261 (151.1999) masked_loss: 1.7646 (1.7453) tag_loss: 149.5672 (149.4547) time: 1.4341 (1.7189) data: 0.0001 (0.0002) to_device: 0.0049 (0.0047) time_gpu: 1.4289 (1.7140) save_time: 8.8805 (25.0095) lr: 0.000066 max mem: 26307 2022-03-16 15:16:21,206.206 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5142857432365417 2022-03-16 15:16:21,207.207 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 141.1366424560547 2022-03-16 15:16:21,207.207 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.16016181239998 2022-03-16 15:16:33,063.063 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01883060298860073 2022-03-16 15:16:33,063.063 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 15:16:33,064.064 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'couple', 'of', 'kids', '[MASK]', 'are', 'in', 'some', 'bags', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 15:16:33,079.079 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['girl', '[UNK]', 'shirt', 'wall', 'hair', 'hand', 'head', 'arm', 'leg', 'eye', 'nose', 'face', 'floor', 'child', 'shoe', 'short', 'fireplace', 'bed', 'food', 'foot', 'sock', 'ear', 'young', 'chair', 'book', 'stripe', 'table', 'pizza', 'boy', 'mouth', 'ponytail', 'plate', 'bag', 'pillow', 'blanket', 'room', 'woman', 'couch', 'box', 'strap', 'carpet', 'hat', 'little', 'picture', 'top', 'fire', 'knee', 'person', 'paper', 'toy'] 2022-03-16 15:16:48,919.919 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'hand', 'face', 'woman', 'hair', 'girl', 'person', 'child', 'bed', 'wall', 'arm', 'couple', 'baby', 'shirt', 'bag', 'ear', 'tag', 'pizza', 'suitcase'] 2022-03-16 15:19:12,955.955 2829:trainer.py:487 do_train_dict(): eta: 20:18:25 iter: 22700 speed: 297.5 images/sec total_norm: 134.8037 (137.6708) loss: 151.8196 (153.4212) masked_loss: 1.5916 (1.6368) tag_loss: 150.1186 (151.7844) time: 1.4348 (1.7211) data: 0.0001 (0.0002) to_device: 0.0050 (0.0050) time_gpu: 1.4296 (1.7159) save_time: 8.8805 (25.0095) lr: 0.000066 max mem: 26307 2022-03-16 15:19:13,317.317 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4117647111415863 2022-03-16 15:19:13,318.318 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 144.55197143554688 2022-03-16 15:19:13,318.318 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.16097405082301 2022-03-16 15:19:25,344.344 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0188747625797987 2022-03-16 15:19:25,344.344 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 15:19:25,345.345 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'person', '[MASK]', 'a', 'dirt', 'bike', 'on', 'a', '[MASK]', 'trail', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 15:19:25,360.360 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['motorcycle', 'dirt', 'helmet', 'man', 'boot', 'tire', 'bike', 'ground', 'sky', 'person', '[UNK]', 'wheel', 'grass', 'shirt', 'head', 'pole', 'hand', 'cone', 'field', 'leg', 'jacket', 'track', 'vest', 'glove', 'fence', 'rider', 'flag', 'mud', 'fender', 'sign', 'rope', 'arm', 'foot', 'uniform', 'face', 'number', 'tree', 'road', 'banner', 'race', 'post', 'barrier', 'spoke', 'background', 'cloud', 'outfit', 'hill', 'course', 'building', 'shadow'] 2022-03-16 15:19:41,194.194 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'hand', 'ground', 'track', 'person', 'arm', 'sky', 'shirt', 'leg', 'trail', 'chain', 'wheel', 'grass', 'stick', 'pole', 'jacket', 'dirt', 'glasses', 'bike', 'mud', 'fence', 'barrier', 'motorcycle', 'boot', 'helmet', 'tire', 'muddy', 'glove', 'vest'] 2022-03-16 15:22:04,842.842 2829:trainer.py:487 do_train_dict(): eta: 20:15:46 iter: 22800 speed: 297.9 images/sec total_norm: 135.3317 (139.6346) loss: 149.1247 (151.6351) masked_loss: 1.6151 (1.6486) tag_loss: 147.5096 (149.9866) time: 1.4324 (1.7189) data: 0.0001 (0.0002) to_device: 0.0050 (0.0050) time_gpu: 1.4271 (1.7137) save_time: 8.8805 (25.0095) lr: 0.000066 max mem: 26307 2022-03-16 15:22:05,203.203 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6571428775787354 2022-03-16 15:22:05,203.203 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 159.89584350585938 2022-03-16 15:22:05,203.203 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.16715342092722 2022-03-16 15:22:17,211.211 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018884718418121338 2022-03-16 15:22:17,211.211 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 15:22:17,211.211 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'there', 'is', 'engulfed', 'pine', '##apple', 'besides', 'a', 'plate', '[MASK]', 'orange', '##s', 'and', 'a', 'small', 'bowl', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 15:22:17,227.227 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['orange', 'bowl', '[UNK]', 'wall', 'table', 'plant', 'cup', 'fruit', 'leaf', 'stem', 'plate', 'pot', 'container', 'vase', 'red', 'top', 'ground', 'bucket', 'cloth', 'flower', 'apple', 'basket', 'tray', 'straw', 'door', 'bunch', 'mat', 'display', 'bowls', 'handle', 'lid', 'brick', 'paint', 'floor', 'glass', 'design', 'next', 'base', 'candle', 'banana', 'paper', 'small', 'mug', 'lemon', 'carpet', 'sign', 'jar', 'fresh', 'stick', 'stand'] 2022-03-16 15:22:33,177.177 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'small', 'cup', 'ground', 'table', 'wall', 'plant', 'orange', 'bowl', 'plate', 'fruit', 'leaf', 'stem', 'pot', 'container', 'straw', 'bucket'] 2022-03-16 15:24:56,941.941 2829:trainer.py:487 do_train_dict(): eta: 20:13:08 iter: 22900 speed: 297.5 images/sec total_norm: 134.7267 (137.5499) loss: 150.8632 (150.9382) masked_loss: 1.6942 (1.6851) tag_loss: 149.1250 (149.2531) time: 1.4336 (1.7210) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4287 (1.7159) save_time: 8.8805 (25.0095) lr: 0.000066 max mem: 26307 2022-03-16 15:24:57,302.302 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.39393940567970276 2022-03-16 15:24:57,303.303 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 175.9125213623047 2022-03-16 15:24:57,303.303 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.15778335903002 2022-03-16 15:25:09,416.416 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018883822485804558 2022-03-16 15:25:09,416.416 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 15:25:09,417.417 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'cat', 'is', 'sitting', 'on', 'top', '[MASK]', 'the', 'refrigerator', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 15:25:09,432.432 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['refrigerator', 'door', 'window', 'wall', 'handle', '[UNK]', 'pen', 'head', 'cap', 'container', 'ear', 'magnet', 'kitchen', 'paper', 'nose', 'table', 'bottle', 'picture', 'eye', 'face', 'grass', 'curtain', 'cat', 'cabinet', 'train', 'box', 'tree', 'ceiling', 'floor', 'chair', 'mouth', 'frame', 'light', 'shirt', 'hair', 'wire', 'person', 'top', 'pencil', 'lid', 'shelf', 'bag', 'towel', 'cup', 'car', 'microwave', 'woman', 'hand', 'reflection', 'cord'] 2022-03-16 15:25:25,413.413 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'can', 'head', 'room', 'top', 'book', 'door', 'light', 'wall', 'eye', 'paper', 'window', 'box', 'kitchen', 'picture', 'leg', 'ear', 'cat', 'mirror', 'grass', 'ceiling', 'cap', 'basket', 'curtain', 'container', 'lid', 'magnet', 'refrigerator', 'paw'] 2022-03-16 15:27:48,964.964 2829:trainer.py:487 do_train_dict(): eta: 20:10:30 iter: 23000 speed: 297.6 images/sec total_norm: 136.8162 (142.2599) loss: 148.6096 (147.6655) masked_loss: 1.6150 (1.6340) tag_loss: 146.6157 (146.0314) time: 1.4332 (1.7203) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4280 (1.7151) save_time: 8.8805 (25.0095) lr: 0.000065 max mem: 26307 2022-03-16 15:27:49,325.325 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091 2022-03-16 15:27:49,326.326 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 161.98287963867188 2022-03-16 15:27:49,326.326 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.14104453413002 2022-03-16 15:28:01,390.390 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01887005940079689 2022-03-16 15:28:01,390.390 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 15:28:01,391.391 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'young', 'kid', '[MASK]', 'laying', 'down', 'and', 'reading', 'a', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 15:28:01,406.406 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hand', 'arm', 'bed', 'book', 'hair', 'head', 'face', 'pillow', 'girl', 'nose', 'blanket', 'eye', 'leg', 'toy', 'shirt', 'bear', '[UNK]', 'boy', 'person', 'ear', 'wall', 'animal', 'woman', 'mouth', 'finger', 'teddy', 'stuffed', 'foot', 'short', 'paw', 'box', 'picture', 'child', 'doll', 'dog', 'young', 'sheet', 'glasses', 'design', 'apple', 'bag', 'bird', 'logo', 'writing', 'dot', 'laptop', 'flower', 'ball', 'floor', 'table'] 2022-03-16 15:28:17,280.280 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['hand', 'book', 'young', 'hair', 'girl', 'person', 'bed', 'arm', 'shirt', 'animal', 'finger', 'ear', 'kid', 'logo', 'blanket', 'toy', 'pillow', 'curtain', 'stuffed'] 2022-03-16 15:30:41,093.093 2829:trainer.py:487 do_train_dict(): eta: 20:07:52 iter: 23100 speed: 297.5 images/sec total_norm: 137.7201 (138.7487) loss: 153.3822 (150.9167) masked_loss: 1.6256 (1.6272) tag_loss: 151.5968 (149.2896) time: 1.4327 (1.7212) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4274 (1.7160) save_time: 8.8805 (25.0095) lr: 0.000065 max mem: 26307 2022-03-16 15:30:41,454.454 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361 2022-03-16 15:30:41,454.454 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 128.96005249023438 2022-03-16 15:30:41,454.454 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.16326327159487 2022-03-16 15:30:53,664.664 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018871325999498367 2022-03-16 15:30:53,664.664 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 15:30:53,665.665 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'herd', 'of', 'sheep', 'near', 'a', 'barn', '[MASK]', 'a', 'mountain', '.', 'kills', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 15:30:53,680.680 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'sky', 'grass', 'barn', 'fence', 'hill', 'sheep', 'field', 'cloud', 'mountain', 'roof', 'building', 'animal', 'herd', 'farm', '[UNK]', 'dirt', 'post', 'house', 'pasture', 'horse', 'grazing', 'cow', 'background', 'rock', 'person', 'grassy', 'head', 'pole', 'green', 'lush', 'ground', 'bush', 'hillside', 'large', 'lamb', 'area', 'open', 'group', 'gate', 'shed', 'dog', 'car', 'door', 'leaf', 'flower', 'shadow', 'goat', 'hay', 'snow'] 2022-03-16 15:31:09,591.591 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['building', 'field', 'hill', 'mountain', 'tree', 'sky', 'animal', 'roof', 'grass', 'cloud', 'sheep', 'fence', 'barn', 'herd'] 2022-03-16 15:33:33,041.041 2829:trainer.py:487 do_train_dict(): eta: 20:05:13 iter: 23200 speed: 297.8 images/sec total_norm: 135.5694 (138.0753) loss: 148.0987 (150.2050) masked_loss: 1.6260 (1.6843) tag_loss: 146.9210 (148.5208) time: 1.4329 (1.7195) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4276 (1.7144) save_time: 8.8805 (25.0095) lr: 0.000065 max mem: 26307 2022-03-16 15:33:33,402.402 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5454545617103577 2022-03-16 15:33:33,403.403 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 138.2624053955078 2022-03-16 15:33:33,403.403 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.16169995811364 2022-03-16 15:33:45,711.711 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018899137154221535 2022-03-16 15:33:45,711.711 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 15:33:45,712.712 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'cattle', 'being', 'herd', '##ed', 'down', 'a', 'trail', 'with', '[MASK]', '[MASK]', 'the', 'distance', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 15:33:45,727.727 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'field', 'grass', 'sky', 'cow', 'water', 'mountain', 'post', 'fence', 'cloud', 'hill', 'river', 'pasture', 'pole', 'head', 'herd', 'leg', 'dirt', '[UNK]', 'bush', 'pond', 'animal', 'smoke', 'road', 'tail', 'group', 'cattle', 'grassy', 'horse', 'path', 'fog', 'distance', 'stream', 'area', 'forest', 'ear', 'lush', 'background', 'rock', 'large', 'person', 'building', 'top', 'ground', 'open', 'next', 'brown', 'shadow', 'green', 'dog'] 2022-03-16 15:34:01,721.721 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'water', 'river', 'field', 'post', 'hill', 'mountain', 'distance', 'tree', 'sky', 'leg', 'trail', 'grass', 'tail', 'cloud', 'dirt', 'fence', 'pond', 'cow', 'pasture'] 2022-03-16 15:36:25,157.157 2829:trainer.py:487 do_train_dict(): eta: 20:02:35 iter: 23300 speed: 297.5 images/sec total_norm: 138.9259 (140.1018) loss: 149.3187 (149.4137) masked_loss: 1.6085 (1.6140) tag_loss: 147.7715 (147.7998) time: 1.4331 (1.7211) data: 0.0001 (0.0005) to_device: 0.0051 (0.0050) time_gpu: 1.4280 (1.7157) save_time: 8.8805 (25.0095) lr: 0.000065 max mem: 26307 2022-03-16 15:36:25,520.520 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5 2022-03-16 15:36:25,520.520 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 138.348876953125 2022-03-16 15:36:25,520.520 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.17541662036864 2022-03-16 15:36:37,742.742 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018946433439850807 2022-03-16 15:36:37,742.742 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 15:36:37,742.742 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'view', 'of', 'a', 'dining', 'room', 'with', 'a', 'chandler', 'above', 'the', 'table', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 15:36:37,758.758 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['chair', 'wall', 'table', 'window', '[UNK]', 'light', 'floor', 'door', 'picture', 'room', 'ceiling', 'curtain', 'glass', 'plant', 'rug', 'bottle', 'vase', 'kitchen', 'dining', 'paper', 'book', 'towel', 'shelf', 'cloth', 'plate', 'candle', 'pot', 'flower', 'napkin', 'stool', 'cabinet', 'blind', 'cushion', 'tile', 'bar', 'basket', 'lamp', 'tray', 'cup', 'coffee', 'area', 'holder', 'switch', 'phone', 'bowl', 'counter', 'mat', 'couch', 'clock', 'fixture'] 2022-03-16 15:36:53,673.673 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'can', 'room', 'door', 'light', 'floor', 'table', 'wall', 'view', 'glass', 'chair', 'plant', 'window', 'picture', 'wine', 'fan', 'bottle', 'ceiling', 'flower', 'dining', 'lamp', 'curtain', 'shelf', 'outlet', 'chandler', 'jar', 'vase', 'rug'] 2022-03-16 15:39:17,450.450 2829:trainer.py:487 do_train_dict(): eta: 19:59:56 iter: 23400 speed: 297.2 images/sec total_norm: 137.9162 (140.4553) loss: 149.7691 (150.4321) masked_loss: 1.5975 (1.5901) tag_loss: 148.1763 (148.8421) time: 1.4351 (1.7230) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4299 (1.7177) save_time: 8.8805 (25.0095) lr: 0.000065 max mem: 26307 2022-03-16 15:39:17,810.810 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5 2022-03-16 15:39:17,811.811 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 159.7098388671875 2022-03-16 15:39:17,811.811 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.16895571769552 2022-03-16 15:39:30,042.042 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018980173394083977 2022-03-16 15:39:30,043.043 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 15:39:30,043.043 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'display', 'in', 'a', '[MASK]', 'filled', 'with', 'lots', 'of', 'fresh', 'produce', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 15:39:30,059.059 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'woman', 'ceiling', 'person', 'wall', '[UNK]', 'chair', 'food', 'glasses', 'short', 'light', 'fruit', 'man', 'refrigerator', 'banana', 'shoe', 'hair', 'floor', 'shelf', 'building', 'table', 'top', 'picture', 'window', 'stand', 'bowl', 'ground', 'case', 'sign', 'bottle', 'bag', 'box', 'cooler', 'tank', 'jean', 'display', 'grape', 'poster', 'door', 'cart', 'tray', 'fan', 'stool', 'hat', 'pastry', 'glass', 'shop', 'apple', 'container', 'lady'] 2022-03-16 15:39:46,100.100 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'building', 'top', 'light', 'woman', 'short', 'case', 'ground', 'hair', 'girl', 'person', 'floor', 'table', 'wall', 'food', 'chair', 'bar', 'box', 'sign', 'shirt', 'picture', 'produce', 'scale', 'drink', 'bowl', 'display', 'restaurant', 'fresh', 'tank', 'bottle', 'ceiling', 'fruit', 'hat', 'cap', 'pole', 'glasses', 'rod', 'basket', 'lid', 'poster', 'banana', 'refrigerator'] 2022-03-16 15:42:09,707.707 2829:trainer.py:487 do_train_dict(): eta: 19:57:18 iter: 23500 speed: 297.2 images/sec total_norm: 141.4391 (143.7444) loss: 150.1758 (150.0126) masked_loss: 1.6447 (1.6607) tag_loss: 148.4586 (148.3519) time: 1.4334 (1.7225) data: 0.0002 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4283 (1.7176) save_time: 8.8805 (25.0095) lr: 0.000065 max mem: 26307 2022-03-16 15:42:10,070.070 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.59375 2022-03-16 15:42:10,070.070 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 134.4872283935547 2022-03-16 15:42:10,070.070 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.18996980634786 2022-03-16 15:42:22,513.513 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018972747027873993 2022-03-16 15:42:22,513.513 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 15:42:22,513.513 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'of', 'sheep', 'grazing', 'on', 'a', 'lush', 'green', 'hillside', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 15:42:22,528.528 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['ear', 'leg', 'sheep', 'grass', 'head', 'tree', 'nose', 'field', 'eye', 'sky', 'road', 'face', '[UNK]', 'shadow', 'fence', 'wool', 'pole', 'background', 'lamb', 'mouth', 'building', 'hill', 'cloud', 'house', 'green', 'grassy', 'foot', 'white', 'mountain', 'car', 'standing', 'path', 'bridge', 'dirt', 'sign', 'bush', 'body', 'tail', 'next', 'line', 'baby', 'paint', 'tag', 'front', 'ground', 'roof', 'animal', 'camera', 'window', 'other'] 2022-03-16 15:42:38,475.475 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'face', 'road', 'field', 'green', 'mouth', 'hill', 'eye', 'tree', 'sky', 'leg', 'nose', 'ear', 'grass', 'tail', 'sheep', 'fence', 'wool', 'lamb', 'herd', 'grazing', 'lush', 'hillside'] 03-16 15:42:44.453 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 15:42:44.453 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 15:42:45.636 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 88}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}] 2022-03-16 15:45:02,088.088 2829:trainer.py:487 do_train_dict(): eta: 19:54:40 iter: 23600 speed: 297.0 images/sec total_norm: 141.3566 (144.9451) loss: 147.9414 (147.7448) masked_loss: 1.5325 (1.5703) tag_loss: 146.0698 (146.1745) time: 1.4329 (1.7239) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4278 (1.7189) save_time: 8.8805 (25.0095) lr: 0.000064 max mem: 26307 2022-03-16 15:45:02,447.447 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.42424243688583374 2022-03-16 15:45:02,448.448 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 138.03775024414062 2022-03-16 15:45:02,448.448 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.20385904754768 2022-03-16 15:45:14,942.942 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01897869072854519 2022-03-16 15:45:14,942.942 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 15:45:14,942.942 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', 'motor', '##cy', '##cl', '##ist', 'is', 'happy', 'composite', 'be', 'on', 'the', 'road', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 15:45:14,958.958 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'man', 'motorcycle', 'water', 'mountain', 'road', 'bike', 'bridge', '[UNK]', 'hill', 'curb', 'bush', 'jacket', 'tire', 'ground', 'head', 'grass', 'tree', 'rock', 'wheel', 'mirror', 'line', 'pole', 'hand', 'background', 'shadow', 'windshield', 'shirt', 'hair', 'helmet', 'light', 'pipe', 'sidewalk', 'face', 'structure', 'field', 'seat', 'ocean', 'dirt', 'building', 'person', 'street', 'distance', 'barrier', 'front', 'tower', 'sunglasses', 'plate', 'box', 'leg'] 2022-03-16 15:45:30,842.842 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'line', 'water', 'road', 'light', 'field', 'ground', 'rock', 'hill', 'bridge', 'mountain', 'distance', 'tree', 'happy', 'box', 'sky', 'shirt', 'shadow', 'wheel', 'grass', 'bush', 'pole', 'jacket', 'dirt', 'bike', 'motorcycle', 'helmet', 'tire', 'curb', 'windshield'] 2022-03-16 15:47:54,396.396 2829:trainer.py:487 do_train_dict(): eta: 19:52:01 iter: 23700 speed: 297.1 images/sec total_norm: 138.1699 (141.0114) loss: 149.3079 (149.9088) masked_loss: 1.6291 (1.6486) tag_loss: 147.3706 (148.2602) time: 1.4332 (1.7231) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4279 (1.7179) save_time: 8.8805 (25.0095) lr: 0.000064 max mem: 26307 2022-03-16 15:47:54,756.756 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6764705777168274 2022-03-16 15:47:54,756.756 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 136.33474731445312 2022-03-16 15:47:54,756.756 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.21426976628665 2022-03-16 15:48:07,142.142 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018957236781716347 2022-03-16 15:48:07,143.143 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 15:48:07,143.143 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'art', '##isan', 'pizza', '[MASK]', 'sit', 'cooked', 'and', 'ready', 'to', 'eat', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 15:48:07,158.158 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['pizza', 'plate', 'table', 'cheese', 'food', '[UNK]', 'crust', 'bowl', 'slice', 'hand', 'tray', 'person', 'dish', 'onion', 'tomato', 'spoon', 'sauce', 'shrimp', 'knife', 'pepper', 'cloth', 'handle', 'napkin', 'fork', 'topping', 'finger', 'pan', 'different', 'top', 'glass', 'delicious', 'small', 'fry', 'pea', 'cup', 'large', 'bread', 'cutter', 'pie', 'shirt', 'arm', 'close', 'wooden', 'rice', 'white', 'ready', 'background', 'fries', 'french', 'ring'] 2022-03-16 15:48:22,985.985 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'hand', 'person', 'table', 'arm', 'phone', 'ready', 'sit', 'plate', 'cheese', 'pizza', 'shrimp', 'napkin', 'onion'] 2022-03-16 15:50:46,901.901 2829:trainer.py:487 do_train_dict(): eta: 19:49:23 iter: 23800 speed: 296.8 images/sec total_norm: 136.3897 (138.5133) loss: 148.4475 (151.0699) masked_loss: 1.6305 (1.6223) tag_loss: 147.2752 (149.4476) time: 1.4336 (1.7250) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4284 (1.7199) save_time: 8.8805 (25.0095) lr: 0.000064 max mem: 26307 2022-03-16 15:50:47,261.261 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5151515007019043 2022-03-16 15:50:47,262.262 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 153.40493774414062 2022-03-16 15:50:47,262.262 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.21598830282937 2022-03-16 15:50:59,841.841 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018976902589201927 2022-03-16 15:50:59,841.841 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 15:50:59,842.842 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'there', '[MASK]', 'a', 'man', '[MASK]', 'through', 'the', 'snow', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 15:50:59,857.857 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', '[UNK]', 'snow', 'ground', 'ski', 'pole', 'sunglasses', 'man', 'trunk', 'jacket', 'track', 'glove', 'hair', 'hand', 'person', 'head', 'coat', 'branch', 'face', 'glasses', 'boot', 'shadow', 'leg', 'skier', 'foot', 'sky', 'slope', 'shoe', 'stick', 'snowy', 'poles', 'hat', 'bush', 'shirt', 'hill', 'country', 'woman', 'arm', 'cross', 'flag', 'house', 'sign', 'base', 'leaf', 'skiing', 'wood', 'building', 'trail', 'path', 'green'] 2022-03-16 15:51:15,751.751 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'ground', 'hair', 'track', 'tree', 'sky', 'leg', 'snow', 'shadow', 'flag', 'coat', 'pole', 'jacket', 'glasses', 'trunk', 'ski', 'boot', 'shoe', 'knot', 'glove', 'sunglasses'] 2022-03-16 15:53:39,358.358 2829:trainer.py:487 do_train_dict(): eta: 19:46:44 iter: 23900 speed: 296.9 images/sec total_norm: 136.4052 (143.0688) loss: 149.8061 (150.1617) masked_loss: 1.6536 (1.6525) tag_loss: 148.2077 (148.5092) time: 1.4324 (1.7246) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4272 (1.7195) save_time: 8.8805 (25.0095) lr: 0.000064 max mem: 26307 2022-03-16 15:53:39,718.718 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.53125 2022-03-16 15:53:39,718.718 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 134.28103637695312 2022-03-16 15:53:39,719.719 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.2244913260142 2022-03-16 15:53:52,212.212 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018963953480124474 2022-03-16 15:53:52,212.212 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 15:53:52,213.213 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'riding', 'a', 'horse', 'in', 'a', 'field', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 15:53:52,228.228 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'grass', 'trunk', 'person', 'tail', 'horse', 'shadow', 'park', 'shirt', 'branch', 'path', '[UNK]', 'ground', 'green', 'man', 'head', 'field', 'building', 'leg', 'woman', 'leaf', 'fence', 'vest', 'road', 'rock', 'house', 'hill', 'pole', 'roof', 'helmet', 'wall', 'hat', 'jacket', 'pathway', 'post', 'saddle', 'track', 'boot', 'front', 'top', 'line', 'rider', 'sign', 'dirt', 'bench', 'couple', 'grassy', 'sky', 'area', 'large'] 2022-03-16 15:54:08,186.186 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['park', 'woman', 'field', 'person', 'tree', 'horse', 'branch', 'shirt', 'path', 'shadow', 'grass', 'tail', 'flower', 'trunk'] 2022-03-16 15:56:31,880.880 2829:trainer.py:487 do_train_dict(): eta: 19:44:05 iter: 24000 speed: 296.8 images/sec total_norm: 134.1169 (136.3411) loss: 147.1749 (148.6082) masked_loss: 1.6555 (1.6598) tag_loss: 145.7497 (146.9485) time: 1.4327 (1.7252) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4276 (1.7201) save_time: 8.8805 (25.0095) lr: 0.000064 max mem: 26307 2022-03-16 15:56:32,241.241 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.529411792755127 2022-03-16 15:56:32,241.241 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 169.04132080078125 2022-03-16 15:56:32,241.241 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.21855190482871 2022-03-16 15:56:44,962.962 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018973737955093384 2022-03-16 15:56:44,963.963 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 15:56:44,963.963 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'person', 'that', '[MASK]', '[MASK]', 'a', 'wine', 'glass', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 15:56:44,979.979 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hair', 'glass', 'woman', 'wall', 'hand', 'wine', 'chair', 'nose', 'painting', 'ear', 'eye', 'head', 'face', 'arm', 'picture', 'shirt', '[UNK]', 'box', 'table', 'bottle', 'top', 'watch', 'plate', 'strap', 'frame', 'mouth', 'man', 'tank', 'person', 'dress', 'girl', 'necklace', 'napkin', 'ring', 'wrist', 'finger', 'water', 'smile', 'bowl', 'label', 'knife', 'bracelet', 'purse', 'light', 'window', 'lid', 'restaurant', 'paper', 'curtain', 'door'] 2022-03-16 15:57:00,917.917 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'hand', 'face', 'water', 'top', 'woman', 'hair', 'person', 'table', 'wall', 'arm', 'glass', 'eye', 'chair', 'paper', 'box', 'shirt', 'label', 'picture', 'painting', 'finger', 'dress', 'nose', 'wine', 'ear', 'frame', 'bottle', 'cap', 'flower', 'glasses', 'pitcher', 'pot', 'shelf', 'rack', 'wallet', 'strap'] 2022-03-16 15:59:24,489.489 2829:trainer.py:487 do_train_dict(): eta: 19:41:27 iter: 24100 speed: 296.6 images/sec total_norm: 133.4590 (136.2929) loss: 147.5028 (147.9796) masked_loss: 1.5983 (1.6425) tag_loss: 145.7913 (146.3371) time: 1.4327 (1.7262) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4277 (1.7211) save_time: 8.8805 (25.0095) lr: 0.000064 max mem: 26307 2022-03-16 15:59:24,849.849 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.53125 2022-03-16 15:59:24,849.849 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 167.29139709472656 2022-03-16 15:59:24,850.850 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.2074146270752 2022-03-16 15:59:37,589.589 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01893622614443302 2022-03-16 15:59:37,590.590 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 15:59:37,590.590 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'man', 'laying', 'flat', 'on', 'a', 'surf', '##board', 'and', 'riding', 'a', 'wave', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 15:59:37,606.606 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['water', '[UNK]', 'wave', 'head', 'hair', 'hand', 'board', 'man', 'arm', 'foot', 'nose', 'face', 'suit', 'ear', 'surfer', 'mouth', 'sleeve', 'leg', 'logo', 'wet', 'person', 'boy', 'cord', 'eye', 'ocean', 'surf', 'watch', 'top', 'shirt', 'hat', 'dog', 'stripe', 'jacket', 'rope', 'reflection', 'shoe', 'design', 'black', 'handle', 'woman', 'short', 'strap', 'vest', 'fin', 'leash', 'girl', 'ankle', 'helmet', 'writing', 'body'] 2022-03-16 15:59:53,489.489 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'water', 'board', 'hair', 'mouth', 'eye', 'leg', 'flat', 'wave', 'nose', 'ear', 'bubble', 'surfer'] 2022-03-16 16:02:16,962.962 2829:trainer.py:487 do_train_dict(): eta: 19:38:48 iter: 24200 speed: 296.9 images/sec total_norm: 135.6889 (138.5408) loss: 148.0948 (150.4723) masked_loss: 1.5965 (1.6018) tag_loss: 146.6483 (148.8705) time: 1.4314 (1.7247) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4265 (1.7196) save_time: 8.8805 (25.0095) lr: 0.000064 max mem: 26307 2022-03-16 16:02:17,324.324 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6153846383094788 2022-03-16 16:02:17,325.325 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 141.43109130859375 2022-03-16 16:02:17,325.325 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.22078709543487 2022-03-16 16:02:29,983.983 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01891171932220459 2022-03-16 16:02:29,983.983 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 16:02:29,984.984 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'person', '[MASK]', 'on', '[MASK]', 'different', 'colored', 'fire', 'hydra', '##nts', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 16:02:29,999.999 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'shoe', 'fire', 'leg', 'ground', 'person', 'man', 'top', 'foot', 'cap', 'bolt', 'road', 'shirt', 'wall', 'sidewalk', 'line', 'hand', 'shadow', 'arm', 'red', 'photo', 'bottom', 'picture', 'knob', 'different', 'base', 'green', 'floor', 'head', 'stripe', 'next', 'jean', 'couple', 'black', 'logo', 'chain', 'lid', 'hair', 'side', 'yellow', 'street', 'image', 'white', 'woman', 'sock', 'jacket', 'number', 'face', 'close', 'wheel'] 2022-03-16 16:02:45,902.902 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'top', 'different', 'fire', 'ground', 'person', 'foot', 'leg', 'cap', 'shoe', 'sidewalk'] 2022-03-16 16:05:09,504.504 2829:trainer.py:487 do_train_dict(): eta: 19:36:09 iter: 24300 speed: 296.7 images/sec total_norm: 136.3977 (138.6405) loss: 153.6107 (152.7312) masked_loss: 1.5811 (1.5897) tag_loss: 152.2676 (151.1415) time: 1.4326 (1.7254) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4272 (1.7202) save_time: 8.8805 (25.0095) lr: 0.000063 max mem: 26307 2022-03-16 16:05:09,866.866 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6764705777168274 2022-03-16 16:05:09,866.866 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 137.2525634765625 2022-03-16 16:05:09,866.866 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.23538722366582 2022-03-16 16:05:22,628.628 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018954303115606308 2022-03-16 16:05:22,628.628 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 16:05:22,629.629 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'small', '[MASK]', 'with', 'bed', 'and', 'other', 'furniture', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 16:05:22,644.644 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['floor', 'room', 'wall', 'bed', 'leg', 'pillow', 'blanket', 'chair', 'tile', 'table', 'bag', 'cushion', '[UNK]', 'cord', 'book', 'clothes', 'bedroom', 'fan', 'paper', 'couch', 'sheet', 'magazine', 'desk', 'clothing', 'stand', 'nightstand', 'seat', 'box', 'handle', 'backpack', 'door', 'window', 'stool', 'can', 'top', 'picture', 'bottle', 'small', 'outlet', 'mattress', 'wire', 'wheel', 'towel', 'shirt', 'carpet', 'furniture', 'lamp', 'hat', 'remote', 'jacket'] 2022-03-16 16:05:38,720.720 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['other', 'small', 'room', 'book', 'floor', 'bed', 'table', 'wall', 'seat', 'base', 'magazine', 'cover', 'glass', 'chair', 'leg', 'bedroom', 'plate', 'mirror', 'fan', 'bottle', 'ceiling', 'cap', 'sheet', 'furniture', 'blanket', 'pillow', 'cord', 'outlet', 'candle', 'tile', 'rack', 'stool', 'cushion'] 2022-03-16 16:08:02,470.470 2829:trainer.py:487 do_train_dict(): eta: 19:33:31 iter: 24400 speed: 296.0 images/sec total_norm: 137.0949 (139.7642) loss: 149.7769 (151.7343) masked_loss: 1.5293 (1.5569) tag_loss: 148.3535 (150.1774) time: 1.4349 (1.7297) data: 0.0001 (0.0005) to_device: 0.0051 (0.0051) time_gpu: 1.4297 (1.7241) save_time: 8.8805 (25.0095) lr: 0.000063 max mem: 26307 2022-03-16 16:08:02,832.832 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.3636363744735718 2022-03-16 16:08:02,832.832 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 138.1591033935547 2022-03-16 16:08:02,833.833 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.24375985593213 2022-03-16 16:08:15,657.657 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018912089988589287 2022-03-16 16:08:15,658.658 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 16:08:15,658.658 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'security', 'officer', 'has', '[MASK]', 'dog', 'searching', 'luggage', '[MASK]', 'the', 'airport', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 16:08:15,673.673 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['suitcase', 'floor', 'shirt', 'man', '[UNK]', 'hair', 'dog', 'ceiling', 'airport', 'luggage', 'hand', 'sign', 'shoe', 'belt', 'light', 'head', 'pillar', 'column', 'arm', 'handle', 'wall', 'person', 'tag', 'bag', 'tile', 'suit', 'building', 'ear', 'leg', 'line', 'ground', 'cart', 'pole', 'sleeve', 'wheel', 'briefcase', 'collar', 'glass', 'case', 'woman', 'terminal', 'door', 'backpack', 'chair', 'boot', 'vest', 'glasses', 'buckle', 'room', 'leash'] 2022-03-16 16:08:31,550.550 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'line', 'light', 'woman', 'hair', 'person', 'floor', 'officer', 'security', 'airport', 'metal', 'sign', 'shirt', 'dog', 'handle', 'wheel', 'cabinet', 'belt', 'ceiling', 'column', 'panel', 'shoe', 'cart', 'pillar', 'suitcase', 'luggage', 'briefcase', 'buckle'] 2022-03-16 16:10:55,067.067 2829:trainer.py:487 do_train_dict(): eta: 19:30:52 iter: 24500 speed: 296.6 images/sec total_norm: 138.0531 (142.2440) loss: 148.4807 (148.6174) masked_loss: 1.5692 (1.5915) tag_loss: 147.0728 (147.0259) time: 1.4327 (1.7259) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4275 (1.7208) save_time: 8.8805 (25.0095) lr: 0.000063 max mem: 26307 2022-03-16 16:10:55,429.429 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5151515007019043 2022-03-16 16:10:55,430.430 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 169.04324340820312 2022-03-16 16:10:55,430.430 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.24308158130181 2022-03-16 16:11:08,381.381 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018916206434369087 2022-03-16 16:11:08,381.381 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 16:11:08,381.381 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'motor', '##cy', '##cl', '##ist', 'takes', '[MASK]', 'turn', 'in', '[MASK]', 'of', 'a', 'crowd', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 16:11:08,397.397 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['motorcycle', 'man', 'person', 'bike', 'road', 'tire', 'tree', 'building', 'jacket', 'helmet', 'hat', '[UNK]', 'curb', 'street', 'wheel', 'sidewalk', 'shirt', 'number', 'sign', 'arm', 'head', 'photo', 'short', 'pole', 'window', 'roof', 'shadow', 'trunk', 'car', 'ground', 'crowd', 'woman', 'wall', 'sunglasses', 'house', 'door', 'rock', 'hand', 'bicycle', 'black', 'cap', 'boot', 'white', 'statue', 'light', 'boy', 'group', 'shoe', 'glove', 'fence'] 2022-03-16 16:11:24,320.320 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'building', 'road', 'front', 'street', 'woman', 'short', 'car', 'person', 'turn', 'arm', 'window', 'tree', 'sign', 'shirt', 'crowd', 'roof', 'hat', 'pole', 'jacket', 'bike', 'motorcycle', 'banner', 'helmet', 'sidewalk', 'tire', 'curb'] 03-16 16:12:45.674 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 16:12:45.674 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 16:12:46.866 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-16 16:13:47,949.949 2829:trainer.py:487 do_train_dict(): eta: 19:28:13 iter: 24600 speed: 296.2 images/sec total_norm: 139.0278 (141.9289) loss: 152.9365 (153.4747) masked_loss: 1.6869 (1.6635) tag_loss: 151.3011 (151.8112) time: 1.4325 (1.7288) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4274 (1.7237) save_time: 8.8805 (25.0095) lr: 0.000063 max mem: 26307 2022-03-16 16:13:48,311.311 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.625 2022-03-16 16:13:48,312.312 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 165.33758544921875 2022-03-16 16:13:48,312.312 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.24130127014901 2022-03-16 16:14:01,288.288 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018921079114079475 2022-03-16 16:14:01,289.289 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 16:14:01,289.289 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'driving', '##fully', 'horse', 'pulled', 'wagon', 'with', 'a', 'cow', 'in', 'back', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 16:14:01,304.304 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['man', 'head', 'cart', 'grass', 'cow', 'wheel', 'tire', 'mountain', 'wagon', 'road', 'hat', 'rock', 'back', 'leg', 'ground', 'tail', 'shirt', 'number', 'person', '[UNK]', 'helmet', 'bull', 'animal', 'jacket', 'shadow', 'hill', 'horse', 'plate', 'harness', 'ear', 'license', 'wall', 'water', 'carriage', 'gravel', 'rope', 'truck', 'sign', 'cattle', 'horn', 'snow', 'trailer', 'drawn', 'wood', 'pole', 'bench', 'bush', 'old', 'dirt', 'tree'] 2022-03-16 16:14:17,198.198 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['back', 'head', 'man', 'number', 'water', 'road', 'ground', 'rock', 'person', 'wall', 'mountain', 'wood', 'horse', 'leg', 'bell', 'plate', 'shadow', 'wheel', 'grass', 'tail', 'hat', 'license', 'jacket', 'wagon', 'helmet', 'cart', 'cow', 'tire', 'harness', 'paw', 'puddle'] 2022-03-16 16:16:40,946.946 2829:trainer.py:487 do_train_dict(): eta: 19:25:35 iter: 24700 speed: 296.0 images/sec total_norm: 136.1129 (140.0984) loss: 148.6623 (146.9752) masked_loss: 1.4780 (1.5283) tag_loss: 146.6696 (145.4469) time: 1.4341 (1.7300) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4288 (1.7248) save_time: 8.8805 (25.0095) lr: 0.000063 max mem: 26307 2022-03-16 16:16:41,306.306 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7575757503509521 2022-03-16 16:16:41,307.307 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 146.0332489013672 2022-03-16 16:16:41,307.307 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.24007534211681 2022-03-16 16:16:54,178.178 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019005320966243744 2022-03-16 16:16:54,178.178 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 16:16:54,179.179 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'an', 'orange', 'piece', 'yellowstone', 'luggage', 'sitting', 'next', 'to', '[MASK]', 'light', 'pole', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 16:16:54,194.194 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['building', 'sky', 'window', 'cloud', 'lamp', 'light', 'tree', 'sign', 'street', 'pole', 'sidewalk', 'roof', '[UNK]', 'post', 'suitcase', 'person', 'bag', 'shadow', 'city', 'road', 'wall', 'ground', 'fence', 'man', 'luggage', 'shirt', 'handle', 'plant', 'car', 'door', 'woman', 'hair', 'fire', 'box', 'sunglasses', 'orange', 'hand', 'chimney', 'base', 'jacket', 'flower', 'can', 'next', 'grass', 'jean', 'arm', 'yellow', 'head', 'brick', 'backpack'] 2022-03-16 16:17:10,123.123 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['next', 'building', 'street', 'light', 'ground', 'post', 'person', 'wall', 'plant', 'window', 'tree', 'piece', 'sign', 'sky', 'roof', 'orange', 'cloud', 'pole', 'lamp', 'sidewalk', 'luggage'] 2022-03-16 16:19:34,007.007 2829:trainer.py:487 do_train_dict(): eta: 19:22:56 iter: 24800 speed: 295.9 images/sec total_norm: 135.8421 (138.9159) loss: 147.8478 (150.6669) masked_loss: 1.6022 (1.6336) tag_loss: 146.6045 (149.0333) time: 1.4335 (1.7306) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4284 (1.7254) save_time: 8.8805 (25.0095) lr: 0.000063 max mem: 26307 2022-03-16 16:19:34,368.368 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6764705777168274 2022-03-16 16:19:34,369.369 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 147.7824249267578 2022-03-16 16:19:34,369.369 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.2547194239605 2022-03-16 16:19:47,437.437 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01897704228758812 2022-03-16 16:19:47,437.437 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 16:19:47,437.437 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'officers', 'are', 'standing', 'and', 'on', '##₇', 'to', 'form', 'a', 'line', 'across', 'a', 'town', '[MASK]', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 16:19:47,453.453 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'horse', 'person', 'man', 'building', 'pole', 'helmet', 'jacket', 'street', '[UNK]', 'sidewalk', 'bus', 'city', 'hat', 'child', 'sky', 'light', 'sign', 'line', 'ground', 'uniform', 'car', 'road', 'girl', 'boot', 'leg', 'bag', 'jean', 'boy', 'brick', 'policeman', 'group', 'shoe', 'police', 'woman', 'head', 'window', 'flag', 'vest', 'parade', 'coat', 'traffic', 'backpack', 'wall', 'post', 'officer', 'shirt', 'face', 'cover', 'tire'] 2022-03-16 16:20:03,499.499 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'city', 'head', 'man', 'town', 'line', 'building', 'street', 'center', 'person', 'officer', 'van', 'window', 'tree', 'horse', 'sign', 'shirt', 'bus', 'flag', 'brick', 'hat', 'pole', 'jacket', 'parade', 'helmet', 'backpack'] 2022-03-16 16:22:27,091.091 2829:trainer.py:487 do_train_dict(): eta: 19:20:18 iter: 24900 speed: 295.8 images/sec total_norm: 138.9472 (141.9700) loss: 148.8446 (149.5296) masked_loss: 1.6578 (1.6754) tag_loss: 147.0405 (147.8542) time: 1.4338 (1.7309) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4288 (1.7258) save_time: 8.8805 (25.0095) lr: 0.000063 max mem: 26307 2022-03-16 16:22:27,453.453 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.625 2022-03-16 16:22:27,453.453 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 131.02325439453125 2022-03-16 16:22:27,454.454 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.26436616516114 2022-03-16 16:22:40,550.550 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01898140087723732 2022-03-16 16:22:40,550.550 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 16:22:40,551.551 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'fig', '##uri', '##nes', 'of', 'humans', 'and', 'animals', 'dressed', 'as', 'humans', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 16:22:40,566.566 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['window', 'sign', 'doll', 'bear', 'hat', 'letter', 'chair', 'head', 'fence', 'cat', 'word', '[UNK]', 'animal', 'reflection', 'table', 'curtain', 'glass', 'display', 'shirt', 'dog', 'ear', 'hair', 'cage', 'man', 'pole', 'skull', 'umbrella', 'stuffed', 'toy', 'wood', 'wall', 'teddy', 'statue', 'paper', 'paw', 'person', 'cloth', 'frame', 'arm', 'clothes', 'logo', 'dress', 'leg', 'face', 'top', 'nose', 'fur', 'woman', 'flag', 'jacket'] 2022-03-16 16:22:56,405.405 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hair', 'word', 'table', 'glass', 'window', 'letter', 'sign', 'dog', 'animal', 'dress', 'bear', 'cat', 'hat', 'cap', 'skull', 'cage', 'fence', 'toy', 'reflection', 'doll', 'kitten'] 2022-03-16 16:25:20,265.265 2829:trainer.py:487 do_train_dict(): eta: 19:17:39 iter: 25000 speed: 295.7 images/sec total_norm: 135.7363 (140.2340) loss: 148.0405 (148.2562) masked_loss: 1.6266 (1.6477) tag_loss: 146.2447 (146.6085) time: 1.4332 (1.7317) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4281 (1.7266) save_time: 8.8805 (25.0095) lr: 0.000062 max mem: 26307 2022-03-16 16:25:20,267.267 2829:checkpoint.py:222 save(): Saving checkpoint to output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0025000.pt 2022-03-16 16:25:29,357.357 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6666666865348816 2022-03-16 16:25:29,358.358 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 158.1492156982422 2022-03-16 16:25:29,358.358 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.27328960829047 2022-03-16 16:25:42,439.439 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01900436170399189 2022-03-16 16:25:42,440.440 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 16:25:42,440.440 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'yellow', 'trolley', 'passing', 'by', 'street', 'intersection', '[MASK]', 'night', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 16:25:42,455.455 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['pole', 'street', 'light', 'window', 'sidewalk', 'road', 'sign', 'car', 'bus', 'traffic', 'curb', 'line', 'door', 'building', '[UNK]', 'poster', 'wall', 'tree', 'night', 'graffiti', 'train', 'pillar', 'stop', 'wheel', 'man', 'person', 'picture', 'city', 'arrow', 'tire', 'fire', 'trolley', 'column', 'base', 'wire', 'ceiling', 'van', 'sky', 'front', 'intersection', 'bridge', 'windshield', 'ground', 'post', 'booth', 'back', 'corner', 'letter', 'box', 'shirt'] 2022-03-16 16:25:58,226.226 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['can', 'line', 'night', 'building', 'door', 'road', 'street', 'light', 'car', 'ground', 'person', 'wall', 'van', 'window', 'train', 'sign', 'yellow', 'bus', 'traffic', 'passing', 'ceiling', 'pole', 'intersection', 'sidewalk', 'trolley'] 2022-03-16 16:28:21,141.141 2829:trainer.py:487 do_train_dict(): eta: 19:15:13 iter: 25100 speed: 283.1 images/sec total_norm: 139.7350 (140.6100) loss: 149.4645 (151.5770) masked_loss: 1.5557 (1.5728) tag_loss: 148.0794 (150.0042) time: 1.4327 (1.8087) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4275 (1.7163) save_time: 8.8805 (21.7526) lr: 0.000062 max mem: 26307 2022-03-16 16:28:21,503.503 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091 2022-03-16 16:28:21,503.503 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 167.398681640625 2022-03-16 16:28:21,503.503 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.2692554564703 2022-03-16 16:28:34,490.490 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018990544602274895 2022-03-16 16:28:34,490.490 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 16:28:34,490.490 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'elephants', 'partially', '[MASK]', 'in', 'a', 'body', 'of', 'water', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 16:28:34,506.506 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['elephant', 'water', 'trunk', 'grass', 'ear', '[UNK]', 'sky', 'head', 'leg', 'eye', 'body', 'ripple', 'name', 'skin', 'tail', 'mouth', 'tree', 'shore', 'foot', 'face', 'back', 'river', 'writing', 'logo', 'land', 'reflection', 'background', 'branch', 'next', 'large', 'other', 'splash', 'shadow', 'bank', 'field', 'wave', 'couple', 'bird', 'baby', 'hair', 'horn', 'ground', 'hole', 'plant', 'rock', 'standing', 'small', 'line', 'bush', 'number'] 2022-03-16 16:28:50,422.422 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'name', 'water', 'body', 'skin', 'eye', 'sky', 'leg', 'wave', 'ear', 'grass', 'logo', 'trunk', 'elephant', 'ripple'] 2022-03-16 16:31:15,575.575 2829:trainer.py:487 do_train_dict(): eta: 19:12:36 iter: 25200 speed: 293.5 images/sec total_norm: 137.7838 (139.7146) loss: 146.6917 (148.7538) masked_loss: 1.5849 (1.6268) tag_loss: 145.3175 (147.1270) time: 1.4328 (1.7443) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4276 (1.7392) save_time: 8.8805 (21.7526) lr: 0.000062 max mem: 26307 2022-03-16 16:31:15,936.936 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6470588445663452 2022-03-16 16:31:15,936.936 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 132.5997772216797 2022-03-16 16:31:15,936.936 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.2923086565945 2022-03-16 16:31:29,139.139 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01899297535419464 2022-03-16 16:31:29,140.140 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 16:31:29,140.140 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'people', 'riding', '[MASK]', 'down', '[MASK]', 'wooded', 'trail', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 16:31:29,156.156 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['head', 'helmet', 'boot', 'horse', 'jacket', '[UNK]', 'tree', 'face', 'foot', 'man', 'leg', 'vest', 'ground', 'person', 'trail', 'branch', 'path', 'bush', 'nose', 'grass', 'ear', 'woman', 'forest', 'hat', 'glove', 'dirt', 'saddle', 'eye', 'road', 'shirt', 'mane', 'patch', 'stripe', 'girl', 'hair', 'brown', 'scarf', 'coat', 'glasses', 'hand', 'sky', 'jean', 'wood', 'boy', 'horseback', 'chain', 'rider', 'tail', 'rock', 'plant'] 2022-03-16 16:31:45,076.076 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'face', 'woman', 'ground', 'person', 'forest', 'eye', 'foot', 'tree', 'horse', 'path', 'leg', 'trail', 'nose', 'grass', 'bush', 'hat', 'jacket', 'glasses', 'boot', 'helmet', 'saddle', 'wooded', 'vest', 'mane'] 2022-03-16 16:34:08,874.874 2829:trainer.py:487 do_train_dict(): eta: 19:09:57 iter: 25300 speed: 295.4 images/sec total_norm: 135.1420 (140.3543) loss: 150.6504 (150.7531) masked_loss: 1.6197 (1.6218) tag_loss: 148.7565 (149.1314) time: 1.4338 (1.7330) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4288 (1.7278) save_time: 8.8805 (21.7526) lr: 0.000062 max mem: 26307 2022-03-16 16:34:09,235.235 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6571428775787354 2022-03-16 16:34:09,236.236 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 148.993896484375 2022-03-16 16:34:09,236.236 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.29301742493637 2022-03-16 16:34:22,331.331 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018994001671671867 2022-03-16 16:34:22,331.331 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 16:34:22,332.332 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'white', 'cow', 'standing', 'on', 'a', 'sandy', '[MASK]', '##ener', 'boats', 'in', 'the', 'background', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 16:34:22,347.347 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['ear', 'sky', 'eye', 'boat', 'sand', 'nose', 'head', 'tree', 'leg', 'beach', 'cow', 'person', 'water', 'shadow', 'mountain', 'rock', 'rope', 'hill', 'face', 'ocean', '[UNK]', 'ground', 'tail', 'building', 'body', 'horn', 'man', 'mouth', 'cloud', 'animal', 'hair', 'sandy', 'background', 'neck', 'wave', 'house', 'shore', 'bird', 'collar', 'child', 'distance', 'couple', 'woman', 'string', 'palm', 'shirt', 'roof', 'footprint', 'umbrella', 'short'] 2022-03-16 16:34:38,282.282 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'water', 'white', 'rock', 'person', 'child', 'wall', 'mountain', 'eye', 'tree', 'beach', 'sky', 'boat', 'ocean', 'leg', 'background', 'wave', 'nose', 'ear', 'shadow', 'flag', 'palm', 'sand', 'tail', 'cloud', 'sandy', 'rope', 'cow'] 2022-03-16 16:37:02,282.282 2829:trainer.py:487 do_train_dict(): eta: 19:07:18 iter: 25400 speed: 295.3 images/sec total_norm: 134.5178 (136.1719) loss: 145.9712 (147.0637) masked_loss: 1.5456 (1.6194) tag_loss: 144.7548 (145.4443) time: 1.4349 (1.7341) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4296 (1.7289) save_time: 8.8805 (21.7526) lr: 0.000062 max mem: 26307 2022-03-16 16:37:02,642.642 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6666666865348816 2022-03-16 16:37:02,643.643 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 149.9478302001953 2022-03-16 16:37:02,643.643 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.30016636567957 2022-03-16 16:37:15,837.837 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01898724026978016 2022-03-16 16:37:15,837.837 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 16:37:15,838.838 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'barefoot', 'man', 'sitting', 'on', 'one', 'of', '[MASK]', 'red', 'chairs', 'reading', 'his', 'phone', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 16:37:15,853.853 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['man', 'shirt', 'ground', 'jean', 'hair', 'glasses', 'hand', 'chair', 'shoe', 'phone', 'leg', 'head', '[UNK]', 'shadow', 'table', 'person', 'ear', 'sidewalk', 'face', 'stool', 'flower', 'cell', 'wall', 'bag', 'foot', 'sock', 'watch', 'stand', 'bench', 'arm', 'sunglasses', 'woman', 'jacket', 'logo', 'short', 'pole', 'backpack', 'top', 'sign', 'stripe', 'dirt', 'number', 'cup', 'next', 'writing', 'window', 'nose', 'letter', 'bottle', 'tie'] 2022-03-16 16:37:31,797.797 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'several', 'red', 'ground', 'hair', 'person', 'table', 'phone', 'chair', 'foot', 'jean', 'shirt', 'leg', 'bag', 'ear', 'shadow', 'glasses', 'shoe', 'stool', 'sunglasses'] 2022-03-16 16:39:55,773.773 2829:trainer.py:487 do_train_dict(): eta: 19:04:40 iter: 25500 speed: 295.1 images/sec total_norm: 137.1805 (140.4316) loss: 148.2104 (147.2036) masked_loss: 1.5100 (1.5815) tag_loss: 145.9834 (145.6221) time: 1.4338 (1.7349) data: 0.0001 (0.0005) to_device: 0.0051 (0.0050) time_gpu: 1.4285 (1.7294) save_time: 8.8805 (21.7526) lr: 0.000062 max mem: 26307 2022-03-16 16:39:56,136.136 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4000000059604645 2022-03-16 16:39:56,136.136 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 141.21771240234375 2022-03-16 16:39:56,137.137 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.3150619417429 2022-03-16 16:40:09,359.359 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01905658282339573 2022-03-16 16:40:09,359.359 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 16:40:09,359.359 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'doll', 'with', '[MASK]', 'large', 'head', 'next', 'to', 'a', 'banana', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 16:40:09,375.375 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['paper', 'desk', 'book', 'table', 'pen', '[UNK]', 'mouse', 'light', 'keyboard', 'computer', 'wall', 'cord', 'button', 'cup', 'ceiling', 'monitor', 'floor', 'wire', 'screen', 'bag', 'box', 'office', 'phone', 'handle', 'key', 'top', 'chair', 'room', 'pile', 'pad', 'laptop', 'container', 'shelf', 'reflection', 'pencil', 'label', 'umbrella', 'picture', 'window', 'man', 'magazine', 'folder', 'logo', 'cell', 'black', 'cap', 'open', 'next', 'notebook', 'white'] 2022-03-16 16:40:25,230.230 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'face', 'large', 'hair', 'mouth', 'floor', 'table', 'wall', 'arm', 'eye', 'paper', 'clothes', 'dress', 'flower', 'bow', 'wire', 'doll', 'shoe', 'cord', 'banana', 'sock'] 03-16 16:42:46.936 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 16:42:46.937 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 16:42:48.291 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-16 16:42:49,334.334 2829:trainer.py:487 do_train_dict(): eta: 19:02:01 iter: 25600 speed: 295.0 images/sec total_norm: 139.8911 (142.9466) loss: 146.4619 (148.1393) masked_loss: 1.5929 (1.6069) tag_loss: 144.8118 (146.5324) time: 1.4344 (1.7356) data: 0.0001 (0.0001) to_device: 0.0052 (0.0051) time_gpu: 1.4292 (1.7304) save_time: 8.8805 (21.7526) lr: 0.000061 max mem: 26307 2022-03-16 16:42:49,696.696 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6000000238418579 2022-03-16 16:42:49,696.696 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 134.64019775390625 2022-03-16 16:42:49,696.696 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.32209862438157 2022-03-16 16:43:02,977.977 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019058486446738243 2022-03-16 16:43:02,977.977 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 16:43:02,978.978 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'guy', 'is', 'taking', 'a', 'picture', 'of', 'himself', 'holding', 'a', 'phone', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 16:43:02,993.993 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['man', 'hat', 'head', 'hand', 'logo', 'building', 'grass', 'person', '[UNK]', 'phone', 'screen', 'jacket', 'camera', 'window', 'mouth', 'face', 'wall', 'picture', 'cap', 'television', 'glasses', 'reflection', 'shirt', 'speaker', 'cell', 'light', 'nose', 'table', 'pole', 'coat', 'handle', 'front', 'button', 'sunglasses', 'sign', 'top', 'ceiling', 'tie', 'woman', 'arm', 'sky', 'black', 'chair', 'bag', 'suit', 'tree', 'ground', 'scarf', 'book', 'finger'] 2022-03-16 16:43:18,930.930 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'hand', 'building', 'light', 'cup', 'mouth', 'wall', 'phone', 'guy', 'chair', 'window', 'sign', 'picture', 'nose', 'camera', 'coat', 'grass', 'hat', 'cap', 'jacket', 'glasses', 'logo', 'sunglasses'] 2022-03-16 16:45:42,649.649 2829:trainer.py:487 do_train_dict(): eta: 18:59:22 iter: 25700 speed: 295.4 images/sec total_norm: 137.5378 (141.7697) loss: 144.8826 (148.2045) masked_loss: 1.6210 (1.6339) tag_loss: 143.1542 (146.5706) time: 1.4324 (1.7332) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4272 (1.7280) save_time: 8.8805 (21.7526) lr: 0.000061 max mem: 26307 2022-03-16 16:45:43,010.010 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5 2022-03-16 16:45:43,010.010 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 146.68502807617188 2022-03-16 16:45:43,010.010 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.3220149639041 2022-03-16 16:45:57,129.129 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019044969230890274 2022-03-16 16:45:57,129.129 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 16:45:57,129.129 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'person', 'taking', 'a', 'picture', 'of', 'a', '[MASK]', 'vanity', '[MASK]', 'sink', 'in', 'a', '[MASK]', 'mirror', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 16:45:57,144.144 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'wall', 'mirror', 'towel', 'sink', 'bathroom', 'shirt', 'floor', 'bottle', 'man', 'person', 'door', 'arm', 'hand', 'tray', 'toilet', 'head', 'light', 'rack', 'tile', 'reflection', 'handle', 'hair', 'soap', 'short', 'tissue', 'shelf', 'drain', 'dish', 'rug', 'tank', 'box', 'woman', 'paper', 'counter', 'tub', 'top', 'outlet', 'glass', 'board', 'leg', 'cup', 'large', 'picture', 'vanity', 'sign', 'plate', 'napkin', 'pipe', 'shower'] 2022-03-16 16:46:13,167.167 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'hand', 'large', 'door', 'woman', 'person', 'floor', 'wall', 'arm', 'watch', 'box', 'shirt', 'picture', 'camera', 'mirror', 'bathroom', 'bottle', 'sink', 'purse', 'towel', 'toilet', 'outlet', 'tile', 'vanity', 'vent'] 2022-03-16 16:48:36,777.777 2829:trainer.py:487 do_train_dict(): eta: 18:56:43 iter: 25800 speed: 294.0 images/sec total_norm: 135.0892 (137.9039) loss: 147.6626 (148.3559) masked_loss: 1.5781 (1.5925) tag_loss: 145.9246 (146.7634) time: 1.4326 (1.7413) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4275 (1.7360) save_time: 8.8805 (21.7526) lr: 0.000061 max mem: 26307 2022-03-16 16:48:37,139.139 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663 2022-03-16 16:48:37,139.139 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 144.58230590820312 2022-03-16 16:48:37,139.139 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.33457571199041 2022-03-16 16:48:50,509.509 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019095269963145256 2022-03-16 16:48:50,509.509 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 16:48:50,510.510 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'holding', 'a', 'tennis', 'ra', '##c', '##quet', 'on', 'a', 'court', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 16:48:50,525.525 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['leg', '[UNK]', 'line', 'short', 'sock', 'hand', 'court', 'man', 'shoe', 'shirt', 'tennis', 'arm', 'hair', 'ball', 'band', 'head', 'player', 'handle', 'ground', 'shadow', 'logo', 'foot', 'wrist', 'face', 'male', 'knee', 'stripe', 'sleeve', 'blue', 'person', 'glove', 'ankle', 'orange', 'green', 'yellow', 'grass', 'watch', 'ready', 'ear', 'match', 'young', 'serve', 'action', 'collar', 'swing', 'bat', 'calf', 'black', 'hat', 'string'] 2022-03-16 16:49:06,351.351 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'line', 'band', 'court', 'short', 'hair', 'arm', 'ball', 'shirt', 'leg', 'handle', 'tennis', 'shadow', 'shoe', 'sock'] 2022-03-16 16:51:30,243.243 2829:trainer.py:487 do_train_dict(): eta: 18:54:04 iter: 25900 speed: 295.2 images/sec total_norm: 137.7176 (139.7167) loss: 147.9422 (148.7543) masked_loss: 1.6580 (1.6801) tag_loss: 146.2596 (147.0742) time: 1.4331 (1.7346) data: 0.0001 (0.0001) to_device: 0.0052 (0.0051) time_gpu: 1.4277 (1.7294) save_time: 8.8805 (21.7526) lr: 0.000061 max mem: 26307 2022-03-16 16:51:30,605.605 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5625 2022-03-16 16:51:30,606.606 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 131.23956298828125 2022-03-16 16:51:30,606.606 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.36111357762263 2022-03-16 16:51:44,148.148 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019090967252850533 2022-03-16 16:51:44,148.148 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 16:51:44,149.149 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'airplanes', 'flying', 'in', 'a', '[MASK]', 'in', 'the', 'sky', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 16:51:44,164.164 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['airplane', 'sky', 'smoke', 'wing', 'trail', 'jet', 'tail', 'cloud', 'formation', 'blue', '[UNK]', 'line', 'group', 'nose', 'stream', 'air', 'aircraft', 'fighter', 'body', 'tree', 'plane', 'high', 'overhead', 'small', 'stripe', 'cockpit', 'front', 'squadron', 'engine', 'red', 'view', 'white', 'back', 'fuselage', 'day', 'other', 'vapor', 'top', 'light', 'clear', 'large', 'tank', 'writing', 'couple', 'window', 'letter', 'logo', 'fin', 'clouds', 'different'] 2022-03-16 16:52:00,083.083 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'wing', 'sky', 'formation', 'trail', 'smoke', 'tail', 'jet', 'airplane'] 2022-03-16 16:54:23,962.962 2829:trainer.py:487 do_train_dict(): eta: 18:51:25 iter: 26000 speed: 294.7 images/sec total_norm: 138.6972 (142.5949) loss: 148.2448 (149.9707) masked_loss: 1.6097 (1.6407) tag_loss: 146.6478 (148.3300) time: 1.4338 (1.7372) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4286 (1.7321) save_time: 8.8805 (21.7526) lr: 0.000061 max mem: 26307 2022-03-16 16:54:24,323.323 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6060606241226196 2022-03-16 16:54:24,324.324 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 119.13093566894531 2022-03-16 16:54:24,324.324 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.37203218014304 2022-03-16 16:54:37,998.998 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019138723611831665 2022-03-16 16:54:37,998.998 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 16:54:37,998.998 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'zebra', 'with', 'brown', '[MASK]', 'is', 'grazing', 'in', 'a', '[MASK]', 'field', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 16:54:38,014.014 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['leg', 'field', 'tail', 'grass', 'zebra', 'head', 'sky', 'ear', 'bush', 'stripe', 'background', 'tree', 'mane', 'hill', 'mountain', '[UNK]', 'eye', 'leaf', 'neck', 'flower', 'nose', 'mouth', 'hair', 'animal', 'ground', 'green', 'face', 'open', 'dirt', 'grassy', 'horn', 'rock', 'back', 'grazing', 'next', 'cow', 'trunk', 'lush', 'other', 'standing', 'deer', 'area', 'distance', 'body', 'cloud', 'top', 'foot', 'large', 'wild', 'young'] 2022-03-16 16:54:53,959.959 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'face', 'field', 'ground', 'hair', 'green', 'mouth', 'brown', 'hill', 'mountain', 'eye', 'neck', 'tree', 'sky', 'leg', 'background', 'nose', 'ear', 'grass', 'tail', 'bush', 'flower', 'leaf', 'stripe', 'mane', 'zebra'] 2022-03-16 16:57:17,605.605 2829:trainer.py:487 do_train_dict(): eta: 18:48:46 iter: 26100 speed: 294.9 images/sec total_norm: 140.6286 (143.1858) loss: 149.3145 (148.3190) masked_loss: 1.6170 (1.6073) tag_loss: 147.3820 (146.7116) time: 1.4340 (1.7365) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4289 (1.7314) save_time: 8.8805 (21.7526) lr: 0.000061 max mem: 26307 2022-03-16 16:57:17,968.968 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5454545617103577 2022-03-16 16:57:17,968.968 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 161.69021606445312 2022-03-16 16:57:17,968.968 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
2022-03-16 16:57:31,614.614 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01917995512485504
2022-03-16 16:57:31,614.614 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 16:57:31,614.614 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'on', 'a', 'red', 'motorcycle', 'during', 'a', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 16:57:31,630.630 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['motorcycle', 'man', 'tire', 'bike', 'helmet', 'tree', 'road', 'bush', 'wall', 'glove', 'wheel', 'person', '[UNK]', 'grass', 'boot', 'leg', 'jacket', 'ground', 'number', 'fence', 'arm', 'dirt', 'curb', 'suit', 'stripe', 'pole', 'post', 'line', 'hand', 'windshield', 'shoe', 'shirt', 'logo', 'foot', 'leaf', 'rider', 'head', 'background', 'sign', 'hedge', 'sidewalk', 'pipe', 'plate', 'light', 'fender', 'street', 'back', 'track', 'red', 'mirror']
2022-03-16 16:57:47,568.568 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'number', 'line', 'road', 'red', 'rock', 'race', 'person', 'tree', 'background', 'chain', 'truck', 'suit', 'flag', 'wheel', 'bush', 'bottle', 'bike', 'motorcycle', 'boot', 'sleeve', 'helmet', 'cart', 'tire', 'curb', 'glove']
2022-03-16 17:00:11,350.350 2829:trainer.py:487 do_train_dict(): eta: 18:46:07 iter: 26200 speed: 294.7 images/sec total_norm: 138.7981 (140.1107) loss: 149.5854 (149.4349) masked_loss: 1.5622 (1.5844) tag_loss: 147.8237 (147.8505) time: 1.4347 (1.7374) data: 0.0001 (0.0002) to_device: 0.0050 (0.0050) time_gpu: 1.4296 (1.7323) save_time: 8.8805 (21.7526) lr: 0.000061 max mem: 26307
2022-03-16 17:00:11,711.711 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6000000238418579
2022-03-16 17:00:11,711.711 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 138.23646545410156
2022-03-16 17:00:11,711.711 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 70.37585929319432
2022-03-16 17:00:25,454.454 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019185159355401993
2022-03-16 17:00:25,454.454 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 17:00:25,455.455 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'group', '[MASK]', 'children', 'standing', 'around', 'a', 'candle', 'filled', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 17:00:25,470.470 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['cake', 'shirt', 'candle', 'hair', 'child', 'birthday', 'boy', 'girl', 'table', 'head', 'hand', 'person', 'tray', '[UNK]', 'eye', 'cup', 'plate', 'woman', 'face', 'flame', 'crown', 'hat', 'nose', 'little', 'picture', 'box', 'writing', 'kid', 'chair', 'wall', 'young', 'front', 'sweater', 'arm', 'floor', 'paper', 'man', 'necklace', 'bench', 'container', 'cardboard', 'dinosaur', 'letter', 'word', 'handle', 'ball', 'window', 'small', 'cream', 'group']
2022-03-16 17:00:41,423.423 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'group', 'hand', 'woman', 'cup', 'hair', 'girl', 'person', 'child', 'table', 'boy', 'writing', 'eye', 'shirt', 'crown', 'nose', 'plate', 'bench', 'cake', 'tray', 'candle', 'sweater', 'dinosaur']
2022-03-16 17:03:05,020.020 2829:trainer.py:487 do_train_dict(): eta: 18:43:27 iter: 26300 speed: 294.8 images/sec total_norm: 137.6970 (139.7932) loss: 152.1063 (149.5219) masked_loss: 1.5213 (1.5976) tag_loss: 150.5312 (147.9243) time: 1.4333 (1.7366) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4281 (1.7314) save_time: 8.8805 (21.7526) lr: 0.000060 max mem: 26307
2022-03-16 17:03:05,381.381 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5714285969734192
2022-03-16 17:03:05,382.382 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 123.57514953613281
2022-03-16 17:03:05,382.382 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 70.38732507012107
2022-03-16 17:03:19,136.136 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019203023985028267
2022-03-16 17:03:19,137.137 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 17:03:19,137.137 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'with', 'an', 'orange', 'shirt', 'hu', '##rl', '##s', '[MASK]', 'fr', '[MASK]', '##bee', 'in', '[MASK]', 'of', 'him', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 17:03:19,152.152 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'short', 'tree', 'ground', 'shoe', 'man', 'shadow', 'arm', 'bush', '[UNK]', 'grass', 'head', 'leg', 'hand', 'sock', 'hair', 'trunk', 'plant', 'hat', 'boy', 'person', 'sky', 'wood', 'dirt', 'ear', 'game', 'young', 'path', 'cap', 'watch', 'pole', 'face', 'field', 'glasses', 'foot', 'park', 'sunglasses', 'disc', 'logo', 'orange', 'wrist', 'playing', 'stump', 'leaf', 'area', 'woman', 'trail', 'air', 'cloud', 'stripe']
2022-03-16 17:03:35,100.100 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'front', 'short', 'ground', 'hair', 'arm', 'tree', 'wood', 'shirt', 'orange', 'shadow', 'grass', 'bush', 'dirt', 'glasses', 'shoe', 'stump', 'sock']
2022-03-16 17:05:58,746.746 2829:trainer.py:487 do_train_dict(): eta: 18:40:48 iter: 26400 speed: 294.7 images/sec total_norm: 140.1110 (145.5084) loss: 147.5887 (148.3997) masked_loss: 1.6384 (1.6942) tag_loss: 145.1637 (146.7055) time: 1.4338 (1.7373) data: 0.0001 (0.0001) to_device: 0.0052 (0.0050) time_gpu: 1.4285 (1.7321) save_time: 8.8805 (21.7526) lr: 0.000060 max mem: 26307
2022-03-16 17:05:59,109.109 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6666666865348816
2022-03-16 17:05:59,109.109 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 127.11598205566406
2022-03-16 17:05:59,109.109 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 70.39912128088609
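The `hu`, `##rl`, `##s` pieces in the caption above are WordPiece subword tokens: a leading `##` marks a piece that attaches to the previous token, so `hu ##rl ##s` spells `hurls`, and the masked `fr [MASK] ##bee` span is almost certainly `frisbee`. A minimal detokenizer under standard BERT WordPiece conventions (a sketch, not the pipeline's own code):

```python
def merge_wordpieces(tokens):
    """Join BERT WordPiece tokens back into words, dropping special tokens."""
    words = []
    for tok in tokens:
        if tok in ('[CLS]', '[SEP]', '[PAD]'):
            continue
        if tok.startswith('##') and words:
            words[-1] += tok[2:]  # continuation piece: glue onto the previous word
        else:
            words.append(tok)
    return ' '.join(words)

sample = ['[CLS]', 'a', 'man', 'with', 'an', 'orange', 'shirt', 'hu', '##rl', '##s',
          '[MASK]', 'fr', '[MASK]', '##bee', 'in', '[MASK]', 'of', 'him', '.', '[SEP]']
print(merge_wordpieces(sample))
# -> "a man with an orange shirt hurls [MASK] fr [MASK]bee in [MASK] of him ."
```

The stray tokens seen in other samples, such as `##⇌` and `##木`, are consistent with BERT-style masked-language-model corruption, where a fraction of selected positions is replaced with random vocabulary tokens rather than `[MASK]`.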
2022-03-16 17:06:12,785.785 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019192570820450783
2022-03-16 17:06:12,785.785 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 17:06:12,786.786 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'lady', 'with', 'a', 'red', '[MASK]', ',', '##⇌', 'pants', ',', 'and', 'a', 'clear', 'umbrella', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 17:06:12,801.801 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['umbrella', '[UNK]', 'sidewalk', 'window', 'person', 'bus', 'jacket', 'pole', 'road', 'ground', 'door', 'street', 'shoe', 'wall', 'line', 'tree', 'tire', 'woman', 'wheel', 'curb', 'sky', 'bush', 'sign', 'coat', 'building', 'hand', 'reflection', 'car', 'leg', 'fence', 'light', 'bag', 'stripe', 'fire', 'rain', 'purse', 'jean', 'man', 'handle', 'plant', 'boy', 'red', 'child', 'rainy', 'background', 'wet', 'water', 'hair', 'foot', 'grass']
2022-03-16 17:06:28,691.691 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'line', 'black', 'building', 'door', 'road', 'street', 'red', 'light', 'woman', 'ground', 'person', 'wall', 'clear', 'lady', 'plant', 'window', 'tree', 'sign', 'sky', 'bus', 'leg', 'bag', 'coat', 'bush', 'pole', 'jacket', 'fence', 'reflection', 'shoe', 'sidewalk', 'tire', 'umbrella', 'curb', 'strap']
2022-03-16 17:08:52,610.610 2829:trainer.py:487 do_train_dict(): eta: 18:38:09 iter: 26500 speed: 294.5 images/sec total_norm: 142.1496 (144.8692) loss: 145.1857 (146.8584) masked_loss: 1.5447 (1.5827) tag_loss: 143.3026 (145.2757) time: 1.4334 (1.7386) data: 0.0001 (0.0001) to_device: 0.0052 (0.0051) time_gpu: 1.4283 (1.7334) save_time: 8.8805 (21.7526) lr: 0.000060 max mem: 26307
2022-03-16 17:08:52,971.971 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6129032373428345
2022-03-16 17:08:52,971.971 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 142.62860107421875
2022-03-16 17:08:52,971.971 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 70.39220493001149
2022-03-16 17:09:06,768.768 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01921571046113968
2022-03-16 17:09:06,769.769 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 17:09:06,769.769 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'dog', 'is', 'on', 'a', 'leash', 'outside', 'of', 'church', 'doors', '[MASK]', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 17:09:06,784.784 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'door', 'building', 'window', 'brick', '[UNK]', 'floor', 'handle', 'sign', 'light', 'glass', 'paper', 'head', 'cat', 'frame', 'hand', 'number', 'tile', 'box', 'pipe', 'pole', 'letter', 'face', 'step', 'ceiling', 'tree', 'top', 'stone', 'flower', 'ear', 'leg', 'plant', 'panel', 'ground', 'word', 'reflection', 'room', 'shelf', 'white', 'front', 'bear', 'picture', 'large', 'clock', 'block', 'ledge', 'doorway', 'curtain', 'phone', 'shutter']
2022-03-16 17:09:22,667.667 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'church', 'building', 'door', 'ground', 'outside', 'wall', 'paper', 'window', 'sign', 'dog', 'pole', 'doorway', 'arch', 'collar', 'sidewalk', 'leash']
2022-03-16 17:11:46,492.492 2829:trainer.py:487 do_train_dict(): eta: 18:35:29 iter: 26600 speed: 294.5 images/sec total_norm: 138.2753 (141.2103) loss: 149.3650 (152.5968) masked_loss: 1.5621 (1.6054) tag_loss: 147.9222 (150.9914) time: 1.4344 (1.7388) data: 0.0001 (0.0005) to_device: 0.0052 (0.0051) time_gpu: 1.4290 (1.7332) save_time: 8.8805 (21.7526) lr: 0.000060 max mem: 26307
2022-03-16 17:11:46,854.854 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361
2022-03-16 17:11:46,855.855 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 172.50955200195312
2022-03-16 17:11:46,855.855 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 70.39589138245314
2022-03-16 17:12:00,728.728 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01919550448656082
2022-03-16 17:12:00,728.728 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 17:12:00,729.729 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'person', 'who', 'is', 'standing', 'on', 'the', '[MASK]', 'and', 'flying', 'a', 'kite', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 17:12:00,744.744 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'kite', '[UNK]', 'leg', 'water', 'string', 'person', 'hair', 'woman', 'handle', 'man', 'hand', 'wave', 'beach', 'head', 'shirt', 'foot', 'ocean', 'rope', 'board', 'girl', 'short', 'suit', 'mountain', 'top', 'line', 'arm', 'grass', 'rock', 'jean', 'shadow', 'sand', 'tree', 'shoe', 'air', 'belt', 'sail', 'cloud', 'parachute', 'surfer', 'shore', 'horizon', 'wet', 'building', 'pole', 'distance', 'wake', 'flag', 'boot', 'boat']
2022-03-16 17:12:16,710.710 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'line', 'water', 'top', 'woman', 'short', 'rock', 'board', 'hair', 'person', 'foot', 'beach', 'sky', 'shirt', 'ocean', 'leg', 'wave', 'string', 'horizon', 'kite']
03-16 17:12:48.353 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi
03-16 17:12:48.353 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi
03-16 17:12:49.390 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 94}]
2022-03-16 17:14:40,445.445 2829:trainer.py:487 do_train_dict(): eta: 18:32:50 iter: 26700 speed: 294.3 images/sec total_norm: 139.1263 (141.1770) loss: 146.5380 (147.8299) masked_loss: 1.5377 (1.5350) tag_loss: 144.9714 (146.2949) time: 1.4334 (1.7395) data: 0.0001 (0.0001) to_device: 0.0052 (0.0051) time_gpu: 1.4283 (1.7342) save_time: 8.8805 (21.7526) lr: 0.000060 max mem: 26307
2022-03-16 17:14:40,807.807 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.48571428656578064
2022-03-16 17:14:40,808.808 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 144.53370666503906
2022-03-16 17:14:40,808.808 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 70.4103467713541
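The interleaved `aml_server.py` records show a `monitor()` routine shelling out to `nvidia-smi` (about every 30 minutes here) and reporting one dict per GPU. The log does not show how those dicts are built, but `nvidia-smi`'s query mode emits exactly these three fields in machine-readable form; the sketch below is a plausible reconstruction, not the actual `aml_server.py` code (the query flags are real `nvidia-smi` options, the helper name is ours):

```python
import subprocess

def gpu_stats():
    """Return per-GPU memory and utilization, one dict per device."""
    out = subprocess.check_output([
        'nvidia-smi',
        '--query-gpu=memory.used,memory.total,utilization.gpu',
        '--format=csv,noheader,nounits',
    ], text=True)
    stats = []
    for row in out.strip().splitlines():
        used, total, util = (int(x) for x in row.split(', '))
        stats.append({'mem_used': used, 'mem_total': total, 'gpu_util': util})
    return stats
```

All eight V100s report roughly 29,000 MiB used of 32,510 MiB at 94-100 % utilization, which is consistent with the trainer's own `max mem: 26307` (presumably MiB of framework-allocated memory) once allocator caching and CUDA context overhead are added.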
2022-03-16 17:14:54,775.775 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019228482618927956
2022-03-16 17:14:54,775.775 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 17:14:54,776.776 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'young', 'boy', 'sitting', '[MASK]', 'a', 'blue', 'toy', 'tractor', 'with', 'sheep', 'next', '[MASK]', 'him', '[MASK]', 'the', 'shadows', 'of', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 17:14:54,791.791 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shadow', 'ground', 'hair', 'boy', 'hand', 'head', 'wheel', 'ear', 'leg', 'sheep', '[UNK]', 'pig', 'baby', 'child', 'nose', 'tire', 'face', 'foot', 'arm', 'dirt', 'shirt', 'animal', 'mouth', 'shoe', 'toy', 'little', 'tractor', 'fence', 'vehicle', 'eye', 'young', 'small', 'back', 'post', 'truck', 'tail', 'pole', 'blue', 'handle', 'cart', 'jean', 'top', 'grass', 'rock', 'man', 'road', 'trunk', 'person', 'wool', 'wood']
2022-03-16 17:15:10,821.821 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'hand', 'next', 'young', 'ground', 'hair', 'blue', 'child', 'arm', 'boy', 'shirt', 'animal', 'leg', 'vehicle', 'nose', 'ear', 'shadow', 'wheel', 'dirt', 'sheep', 'toy', 'straw', 'tractor']
2022-03-16 17:17:34,311.311 2829:trainer.py:487 do_train_dict(): eta: 18:30:10 iter: 26800 speed: 294.5 images/sec total_norm: 136.9151 (139.9857) loss: 147.9747 (147.2288) masked_loss: 1.5170 (1.5567) tag_loss: 146.1845 (145.6721) time: 1.4322 (1.7388) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4271 (1.7336) save_time: 8.8805 (21.7526) lr: 0.000060 max mem: 26307
2022-03-16 17:17:34,673.673 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5
2022-03-16 17:17:34,673.673 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 139.01046752929688
2022-03-16 17:17:34,674.674 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 70.40696632640513
2022-03-16 17:17:48,717.717 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019247330725193024
2022-03-16 17:17:48,717.717 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 17:17:48,717.717 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'train', 'that', 'is', 'covered', '[MASK]', 'on', 'the', 'tracks', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 17:17:48,733.733 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['train', 'track', 'wheel', 'tree', 'door', 'sky', 'window', 'car', 'roof', 'gravel', '[UNK]', 'ground', 'building', 'pole', 'stripe', 'line', 'grass', 'bumper', 'cover', 'light', 'fence', 'top', 'plastic', 'handle', 'container', 'red', 'pipe', 'box', 'wall', 'front', 'background', 'paint', 'sign', 'vent', 'platform', 'post', 'old', 'bush', 'house', 'windshield', 'ladder', 'wire', 'engine', 'tank', 'trailer', 'person', 'hose', 'writing', 'tire', 'yard']
2022-03-16 17:18:04,679.679 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'house', 'building', 'door', 'car', 'track', 'post', 'window', 'train', 'tree', 'sky', 'roof', 'wheel', 'pipe', 'lamp', 'gravel', 'container']
2022-03-16 17:20:28,287.287 2829:trainer.py:487 do_train_dict(): eta: 18:27:31 iter: 26900 speed: 294.3 images/sec total_norm: 138.2182 (142.0513) loss: 147.9337 (150.1810) masked_loss: 1.5444 (1.5876) tag_loss: 146.2841 (148.5934) time: 1.4331 (1.7397) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4280 (1.7345) save_time: 8.8805 (21.7526) lr: 0.000059 max mem: 26307
2022-03-16 17:20:28,648.648 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6285714507102966
2022-03-16 17:20:28,648.648 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 132.73724365234375
2022-03-16 17:20:28,648.648 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 70.40635713647913
2022-03-16 17:20:42,578.578 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019264230504631996
2022-03-16 17:20:42,578.578 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 17:20:42,579.579 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'sign', 'at', 'the', 'corner', 'of', 'two', 'streets', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 17:20:42,594.594 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'tree', 'building', 'pole', 'sign', 'wire', 'roof', 'door', 'street', '[UNK]', 'light', 'road', 'power', 'window', 'bush', 'line', 'stop', 'car', 'tire', 'chimney', 'letter', 'clock', 'wall', 'grass', 'sidewalk', 'garage', 'house', 'traffic', 'parking', 'post', 'fence', 'front', 'truck', 'telephone', 'shadow', 'number', 'arrow', 'flag', 'suv', 'fire', 'tower', 'man', 'leaf', 'lot', 'flower', 'lamp', 'church', 'wheel', 'curb', 'white']
2022-03-16 17:20:58,446.446 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['house', 'line', 'building', 'power', 'street', 'light', 'car', 'stop', 'mountain', 'tree', 'corner', 'letter', 'sign', 'sky', 'pole', 'telephone', 'chimney', 'graffiti']
2022-03-16 17:23:22,333.333 2829:trainer.py:487 do_train_dict(): eta: 18:24:51 iter: 27000 speed: 294.2 images/sec total_norm: 137.8419 (140.2704) loss: 145.6339 (148.3549) masked_loss: 1.5770 (1.5956) tag_loss: 143.7361 (146.7593) time: 1.4351 (1.7405) data: 0.0001 (0.0001) to_device: 0.0052 (0.0051) time_gpu: 1.4298 (1.7353) save_time: 8.8805 (21.7526) lr: 0.000059 max mem: 26307
2022-03-16 17:23:22,697.697 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5
2022-03-16 17:23:22,697.697 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 130.75640869140625
2022-03-16 17:23:22,697.697 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 70.4195082970651
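`Tag mAP` is logged alongside tag precision at every report, but its exact definition is not visible in this log. For a multi-label tagger the conventional choice is mean average precision over tag classes, which would be computed roughly as below (an illustrative sketch using scikit-learn, not this repository's implementation):

```python
import numpy as np
from sklearn.metrics import average_precision_score

def tag_map(y_true, y_score):
    """Mean AP over tag classes that have at least one positive label.

    y_true:  (n_samples, n_tags) binary ground-truth matrix
    y_score: (n_samples, n_tags) per-tag scores or probabilities
    """
    aps = [average_precision_score(y_true[:, c], y_score[:, c])
           for c in range(y_true.shape[1]) if y_true[:, c].any()]
    return float(np.mean(aps))

# toy example: 4 samples, 3 tags, perfectly ranked scores
y_true = np.array([[1, 0, 0], [1, 1, 0], [0, 1, 0], [0, 0, 1]])
y_score = np.array([[0.9, 0.2, 0.1], [0.8, 0.7, 0.2],
                    [0.3, 0.6, 0.1], [0.1, 0.2, 0.9]])
print(tag_map(y_true, y_score))  # 1.0 for this toy data
```

Whatever the exact definition, the logged values sit around 0.019 and creep upward by only a few 1e-5 per hundred iterations, so this reads as an early-training diagnostic rather than a converged score.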
2022-03-16 17:23:36,794.794 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01930582895874977
2022-03-16 17:23:36,794.794 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 17:23:36,794.794 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'an', 'older', 'man', 'in', 'a', 'wind', '##breaker', 'standing', 'by', '[MASK]', 'street', 'while', '[MASK]', 'horse', 'walks', 'by', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 17:23:36,810.810 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['man', 'sky', 'tree', 'jacket', 'car', 'street', 'roof', 'window', 'shoe', 'road', 'building', 'short', '[UNK]', 'fence', 'head', 'hat', 'shirt', 'wagon', 'carriage', 'cart', 'horse', 'leg', 'wheel', 'person', 'sock', 'face', 'beard', 'pole', 'sunglasses', 'house', 'cap', 'hair', 'sidewalk', 'windshield', 'wall', 'sign', 'tire', 'harness', 'coat', 'neck', 'hand', 'paper', 'ground', 'glasses', 'nose', 'line', 'brick', 'railing', 'drawn', 'chimney']
2022-03-16 17:23:52,804.804 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'house', 'hand', 'building', 'road', 'street', 'short', 'car', 'hair', 'person', 'wall', 'standing', 'paper', 'window', 'tree', 'horse', 'sky', 'shirt', 'roof', 'gate', 'wheel', 'hat', 'cap', 'pole', 'jacket', 'walks', 'fence', 'carriage', 'wagon', 'beard', 'shoe', 'cart', 'tire', 'sunglasses', 'harness', 'windshield', 'sock']
2022-03-16 17:26:16,523.523 2829:trainer.py:487 do_train_dict(): eta: 18:22:12 iter: 27100 speed: 293.9 images/sec total_norm: 140.7691 (142.9792) loss: 147.4709 (147.8035) masked_loss: 1.4918 (1.5132) tag_loss: 145.9364 (146.2903) time: 1.4334 (1.7419) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4283 (1.7367) save_time: 8.8805 (21.7526) lr: 0.000059 max mem: 26307
2022-03-16 17:26:16,883.883 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.529411792755127
2022-03-16 17:26:16,884.884 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 152.34548950195312
2022-03-16 17:26:16,884.884 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 70.41437843266655
2022-03-16 17:26:31,045.045 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019328787922859192
2022-03-16 17:26:31,045.045 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 17:26:31,046.046 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'skier', 'kicks', '[MASK]', '[MASK]', 'while', 'riding', 'through', 'heavy', 'white', 'snow', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 17:26:31,061.061 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['snow', 'shadow', '[UNK]', 'person', 'jacket', 'man', 'ground', 'mountain', 'sky', 'ski', 'pole', 'head', 'hill', 'coat', 'skier', 'arm', 'glove', 'cloud', 'hat', 'leg', 'slope', 'rock', 'track', 'tree', 'snowy', 'hand', 'helmet', 'foot', 'backpack', 'boot', 'board', 'face', 'sign', 'background', 'top', 'shirt', 'woman', 'building', 'fence', 'hair', 'side', 'cap', 'sun', 'sunglasses', 'group', 'lift', 'steep', 'line', 'wire', 'day']
2022-03-16 17:26:46,984.984 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'white', 'ground', 'board', 'person', 'hill', 'heavy', 'mountain', 'sky', 'leg', 'snow', 'shadow', 'cloud', 'pole', 'jacket', 'helmet', 'glove']
2022-03-16 17:29:10,576.576 2829:trainer.py:487 do_train_dict(): eta: 18:19:32 iter: 27200 speed: 294.2 images/sec total_norm: 142.0314 (143.9162) loss: 149.3119 (151.0374) masked_loss: 1.5927 (1.6276) tag_loss: 147.6635 (149.4098) time: 1.4338 (1.7405) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4285 (1.7354) save_time: 8.8805 (21.7526) lr: 0.000059 max mem: 26307
2022-03-16 17:29:10,938.938 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5
2022-03-16 17:29:10,938.938 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 151.21751403808594
2022-03-16 17:29:10,938.938 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 70.4192025460603
2022-03-16 17:29:25,119.119 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019347792491316795
2022-03-16 17:29:25,119.119 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 17:29:25,119.119 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'men', 'in', 'front', 'of', '[MASK]', 'parking', '[MASK]', 'with', 'an', 'official', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 17:29:25,135.135 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['meter', 'car', 'sidewalk', 'tire', 'parking', 'street', 'pole', 'road', 'license', '[UNK]', 'shirt', 'plate', 'van', 'curb', 'tree', 'man', 'door', 'head', 'building', 'vehicle', 'light', 'window', 'hair', 'person', 'suv', 'hand', 'wheel', 'sign', 'truck', 'arm', 'line', 'leaf', 'tail', 'trunk', 'shoe', 'leg', 'windshield', 'short', 'bag', 'bumper', 'hat', 'mirror', 'woman', 'next', 'brick', 'handle', 'phone', 'sky', 'side', 'skirt']
2022-03-16 17:29:41,124.124 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'building', 'road', 'front', 'street', 'light', 'short', 'car', 'hair', 'wall', 'official', 'van', 'paper', 'tree', 'sign', 'shirt', 'traffic', 'vehicle', 'roof', 'plate', 'parking', 'hat', 'license', 'cap', 'pole', 'glasses', 'meter', 'fence', 'shoe', 'flip', 'sidewalk', 'drain', 'tire', 'curb', 'flop']
2022-03-16 17:32:04,830.830 2829:trainer.py:487 do_train_dict(): eta: 18:16:52 iter: 27300 speed: 293.8 images/sec total_norm: 138.5890 (141.6856) loss: 145.5451 (147.0515) masked_loss: 1.5990 (1.5987) tag_loss: 143.6497 (145.4527) time: 1.4339 (1.7425) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4289 (1.7374) save_time: 8.8805 (21.7526) lr: 0.000059 max mem: 26307
2022-03-16 17:32:05,191.191 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6176470518112183
2022-03-16 17:32:05,191.191 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 148.53842163085938
2022-03-16 17:32:05,192.192 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 70.41587903377784
2022-03-16 17:32:19,416.416 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019326167181134224
2022-03-16 17:32:19,416.416 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 17:32:19,417.417 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'desk', '[MASK]', '[MASK]', 'computer', 'and', 'a', 'tv', 'inside', 'of', 'a', 'room', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 17:32:19,432.432 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'leg', 'window', 'keyboard', 'screen', 'computer', 'floor', 'laptop', 'desk', 'table', 'mouse', 'monitor', 'speaker', 'room', 'pen', 'lamp', 'picture', '[UNK]', 'cup', 'pad', 'cord', 'chair', 'phone', 'stand', 'office', 'ball', 'outlet', 'top', 'box', 'television', 'base', 'shelf', 'pencil', 'door', 'light', 'toy', 'bowl', 'handle', 'remote', 'cell', 'ipod', 'camera', 'book', 'icon', 'mug', 'holder', 'poster', 'bottle', 'desktop', 'can']
2022-03-16 17:32:35,375.375 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['room', 'cup', 'inside', 'tv', 'floor', 'table', 'wall', 'stand', 'computer', 'window', 'ball', 'picture', 'screen', 'leg', 'desk', 'clock', 'shadow', 'speaker', 'ceiling', 'switch', 'pen', 'mouse', 'monitor', 'alarm', 'keyboard', 'lamp', 'laptop', 'drawer', 'dresser']
2022-03-16 17:34:59,132.132 2829:trainer.py:487 do_train_dict(): eta: 18:14:13 iter: 27400 speed: 293.7 images/sec total_norm: 137.9875 (139.2648) loss: 143.8305 (145.4532) masked_loss: 1.6015 (1.6724) tag_loss: 142.6362 (143.7808) time: 1.4342 (1.7431) data: 0.0001 (0.0002) to_device: 0.0052 (0.0050) time_gpu: 1.4289 (1.7379) save_time: 8.8805 (21.7526) lr: 0.000059 max mem: 26307
2022-03-16 17:34:59,493.493 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6363636255264282
2022-03-16 17:34:59,493.493 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 118.86441802978516
2022-03-16 17:34:59,493.493 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 70.43792420820756
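`caption acc` follows each stats line and, given the `[MASK]` positions in the sampled captions, is plausibly the fraction of masked tokens the model restores correctly: values such as 0.6060606241226196 (= 20/33) and 0.6176470518112183 (= 21/34) are ratios of small integers, which fits a per-batch count of masked positions. A sketch of that metric in PyTorch (hypothetical, matching the log only in spirit):

```python
import torch

def masked_token_accuracy(logits, target, mask):
    """Accuracy over masked positions only.

    logits: (N, T, V) prediction scores over the vocabulary
    target: (N, T) ground-truth token ids
    mask:   (N, T) bool, True where the input token was corrupted/[MASK]
    """
    pred = logits.argmax(dim=-1)
    correct = (pred == target) & mask
    return correct.sum().float() / mask.sum().clamp(min=1).float()
```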
2022-03-16 17:35:13,577.577 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019366972148418427
2022-03-16 17:35:13,577.577 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 17:35:13,578.578 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'and', 'his', 'owner', 'in', 'the', 'kitchen', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 17:35:13,593.593 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['dog', 'floor', 'eye', 'head', 'ear', 'cabinet', 'leg', 'person', 'refrigerator', 'rug', 'door', 'foot', 'nose', '[UNK]', 'paw', 'collar', 'drawer', 'man', 'neck', 'sock', 'kitchen', 'handle', 'wall', 'shoe', 'face', 'food', 'tail', 'mat', 'hand', 'cord', 'hair', 'stove', 'knob', 'white', 'brown', 'carpet', 'bottle', 'someone', 'cat', 'next', 'jean', 'wire', 'shirt', 'oven', 'paper', 'bag', 'something', 'open', 'top', 'small']
2022-03-16 17:35:29,468.468 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'door', 'person', 'floor', 'wall', 'eye', 'neck', 'foot', 'kitchen', 'dog', 'owner', 'leg', 'nose', 'ear', 'handle', 'cabinet', 'collar', 'shoe', 'drawer', 'refrigerator', 'rug', 'paw']
2022-03-16 17:37:53,499.499 2829:trainer.py:487 do_train_dict(): eta: 18:11:33 iter: 27500 speed: 293.6 images/sec total_norm: 138.7480 (142.2419) loss: 148.2450 (148.0545) masked_loss: 1.6379 (1.6286) tag_loss: 146.5880 (146.4260) time: 1.4346 (1.7436) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4294 (1.7385) save_time: 8.8805 (21.7526) lr: 0.000059 max mem: 26307
2022-03-16 17:37:53,859.859 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6060606241226196
2022-03-16 17:37:53,859.859 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 155.71047973632812
2022-03-16 17:37:53,859.859 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 70.4278347319451
2022-03-16 17:38:08,100.100 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019369781017303467
2022-03-16 17:38:08,100.100 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 17:38:08,101.101 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'is', 'standing', 'on', 'top', 'of', 'a', 'large', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 17:38:08,116.116 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'cloud', 'rock', 'head', 'sheep', 'face', 'leg', 'grass', 'moss', 'boulder', 'hill', 'wool', 'mountain', 'tree', 'cliff', 'top', 'goat', 'fur', 'rocky', 'ear', '[UNK]', 'blue', 'body', 'bush', 'ground', 'standing', 'animal', 'horn', 'stone', 'side', 'nose', 'large', 'branch', 'next', 'plant', 'hillside', 'ram', 'couple', 'cloudy', 'foot', 'mouth', 'white', 'day', 'area', 'green', 'wall', 'tail', 'group', 'black', 'steep']
2022-03-16 17:38:24,038.038 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'face', 'large', 'top', 'rock', 'hill', 'mountain', 'sky', 'leg', 'grass', 'bush', 'cloud', 'sheep', 'moss', 'wool', 'boulder']
2022-03-16 17:40:47,754.754 2829:trainer.py:487 do_train_dict(): eta: 18:08:53 iter: 27600 speed: 293.8 images/sec total_norm: 140.4390 (149.8653) loss: 147.6465 (148.3757) masked_loss: 1.6717 (1.6804) tag_loss: 145.7764 (146.6953) time: 1.4335 (1.7425) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4281 (1.7373) save_time: 8.8805 (21.7526) lr: 0.000058 max mem: 26307
2022-03-16 17:40:48,115.115 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7352941036224365
2022-03-16 17:40:48,115.115 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 149.34246826171875
2022-03-16 17:40:48,115.115 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 70.43072164101721
2022-03-16 17:41:02,389.389 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01937682181596756
2022-03-16 17:41:02,389.389 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 17:41:02,389.389 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'the', 'window', 'you', 'can', 'see', 'a', 'reflection', 'of', '##木', 'sea', 'and', 'beautiful', 'buildings', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 17:41:02,405.405 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['water', 'building', 'window', 'sky', 'boat', '[UNK]', 'floor', 'person', 'sign', 'balcony', 'pole', 'roof', 'glass', 'reflection', 'door', 'railing', 'flag', 'wall', 'tree', 'post', 'umbrella', 'table', 'clock', 'chair', 'river', 'frame', 'bridge', 'canopy', 'arch', 'leg', 'tower', 'dock', 'man', 'head', 'sidewalk', 'dome', 'hand', 'shirt', 'top', 'deck', 'front', 'light', 'bench', 'lamp', 'doorway', 'woman', 'ground', 'handle', 'large', 'letter']
2022-03-16 17:41:18,375.375 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['water', 'building', 'sea', 'window', 'beautiful', 'letter', 'sign', 'bus', 'boat', 'mirror', 'dome', 'reflection', 'balcony', 'umbrella', 'stripe']
03-16 17:42:49.489 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi
03-16 17:42:49.489 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi
03-16 17:42:50.801 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}]
2022-03-16 17:43:42,162.162 2829:trainer.py:487 do_train_dict(): eta: 18:06:13 iter: 27700 speed: 293.6 images/sec total_norm: 137.5406 (139.4670) loss: 147.0752 (146.6626) masked_loss: 1.5796 (1.6098) tag_loss: 145.6542 (145.0528) time: 1.4337 (1.7441) data: 0.0001 (0.0005) to_device: 0.0052 (0.0051) time_gpu: 1.4284 (1.7385) save_time: 8.8805 (21.7526) lr: 0.000058 max mem: 26307
2022-03-16 17:43:42,522.522 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5833333134651184
2022-03-16 17:43:42,523.523 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 131.0507354736328
2022-03-16 17:43:42,523.523 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 70.44000179647541
2022-03-16 17:43:56,943.943 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019360285252332687
2022-03-16 17:43:56,944.944 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 17:43:56,944.944 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'laptop', 'computer', 'sitting', 'on', '[MASK]', 'of', 'a', '[MASK]', 'desk', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 17:43:56,959.959 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['computer', 'desk', 'monitor', 'keyboard', 'mouse', 'screen', 'table', 'laptop', 'apple', 'stand', 'wall', 'cord', '[UNK]', 'logo', 'base', 'speaker', 'pad', 'shelf', 'picture', 'phone', 'lamp', 'wire', 'tree', 'box', 'paper', 'book', 'light', 'pen', 'floor', 'desktop', 'plug', 'printer', 'icon', 'television', 'cell', 'office', 'window', 'handle', 'top', 'key', 'frame', 'chair', 'cup', 'glass', 'sign', 'front', 'bottle', 'curtain', 'room', 'tray']
2022-03-16 17:44:12,917.917 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'white', 'top', 'light', 'office', 'table', 'wall', 'stand', 'computer', 'screen', 'desk', 'speaker', 'mouse', 'monitor', 'logo', 'keyboard', 'lamp', 'cord', 'laptop']
2022-03-16 17:46:36,579.579 2829:trainer.py:487 do_train_dict(): eta: 18:03:33 iter: 27800 speed: 293.6 images/sec total_norm: 140.5508 (144.2280) loss: 146.3595 (147.4450) masked_loss: 1.6444 (1.6544) tag_loss: 144.7416 (145.7906) time: 1.4335 (1.7442) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4282 (1.7390) save_time: 8.8805 (21.7526) lr: 0.000058 max mem: 26307
2022-03-16 17:46:36,940.940 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.625
2022-03-16 17:46:36,940.940 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 165.9900665283203
2022-03-16 17:46:36,940.940 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 70.43810798617675
2022-03-16 17:46:51,473.473 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01936018280684948
2022-03-16 17:46:51,473.473 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 17:46:51,474.474 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'yellow', 'plane', 'prepares', 'to', '[MASK]', 'off', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 17:46:51,489.489 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'airplane', 'wheel', 'wing', 'grass', 'cloud', 'tail', 'cockpit', 'propeller', 'tree', 'ground', 'field', 'gear', 'shadow', 'runway', 'pilot', 'star', 'nose', 'yellow', 'landing', 'man', 'person', 'number', 'window', 'logo', 'bush', 'letter', 'small', 'plane', 'engine', '[UNK]', 'tire', 'building', 'front', 'blade', 'aircraft', 'blue', 'road', 'hedge', 'circle', 'fighter', 'cross', 'windshield', 'top', 'stripe', 'shirt', 'fence', 'white', 'jet', 'flag']
2022-03-16 17:47:07,386.386 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['man', 'road', 'field', 'ground', 'star', 'cross', 'window', 'wing', 'tree', 'letter', 'sky', 'yellow', 'pilot', 'nose', 'tiny', 'plane', 'wheel', 'grass', 'tail', 'bush', 'cloud', 'runway', 'airplane', 'cockpit', 'propeller', 'hedge']
2022-03-16 17:49:31,069.069 2829:trainer.py:487 do_train_dict(): eta: 18:00:53 iter: 27900 speed: 293.4 images/sec total_norm: 138.8524 (141.7900) loss: 149.2848 (149.3548) masked_loss: 1.5838 (1.5781) tag_loss: 147.7480 (147.7767) time: 1.4349 (1.7449) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4296 (1.7397) save_time: 8.8805 (21.7526) lr: 0.000058 max mem: 26307
2022-03-16 17:49:31,432.432 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6176470518112183
2022-03-16 17:49:31,433.433 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 159.7484130859375
2022-03-16 17:49:31,433.433 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 70.43593192781721
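The `eta` field is just the averaged iteration time extrapolated over the remaining steps, which makes the planned run length recoverable from any single line: at iter 26000, eta 18:51:25 is 67,885 s, and 67,885 / 1.7372 s/iter leaves about 39,077 iterations, i.e. a total of roughly 65,000; repeating the arithmetic at iter 28500 (63,893 / 1.7473 ≈ 36,567 remaining) lands on the same total, so the schedule is internally consistent. A two-line check:

```python
def remaining_iters(eta_hms, avg_iter_time):
    """Convert an 'H:MM:SS' eta into an iteration count at the given s/iter."""
    h, m, s = (int(x) for x in eta_hms.split(':'))
    return round((h * 3600 + m * 60 + s) / avg_iter_time)

print(26000 + remaining_iters('18:51:25', 1.7372))  # 65077
print(28500 + remaining_iters('17:44:53', 1.7473))  # 65067
```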
2022-03-16 17:49:46,010.010 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019337672740221024
2022-03-16 17:49:46,010.010 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 17:49:46,010.010 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'brown', 'and', 'white', 'cow', 'drinking', 'from', 'black', 'sp', '##out', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 17:49:46,025.025 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['mouth', 'nose', 'eye', 'head', 'cow', 'ear', 'face', 'grass', 'horn', 'neck', '[UNK]', 'collar', 'spot', 'plant', 'fence', 'ground', 'wall', 'tree', 'leg', 'weed', 'dirt', 'water', 'chin', 'tag', 'tongue', 'white', 'hair', 'rope', 'bush', 'field', 'rock', 'harness', 'buckle', 'snout', 'bell', 'flower', 'leaf', 'brown', 'hay', 'tail', 'patch', 'number', 'next', 'fur', 'pole', 'black', 'lip', 'close', 'chain', 'metal']
2022-03-16 17:50:01,940.940 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'number', 'face', 'black', 'white', 'ground', 'mouth', 'eye', 'neck', 'spot', 'tongue', 'nose', 'ear', 'shadow', 'lip', 'grass', 'drinking', 'tag', 'horn', 'collar', 'cow', 'weed', 'hose', 'buckle']
2022-03-16 17:52:25,466.466 2829:trainer.py:487 do_train_dict(): eta: 17:58:13 iter: 28000 speed: 293.6 images/sec total_norm: 141.1687 (143.0833) loss: 146.3469 (146.2297) masked_loss: 1.5239 (1.5871) tag_loss: 144.7503 (144.6426) time: 1.4330 (1.7440) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4278 (1.7387) save_time: 8.8805 (21.7526) lr: 0.000058 max mem: 26307
2022-03-16 17:52:25,829.829 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091
2022-03-16 17:52:25,829.829 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 148.07962036132812
2022-03-16 17:52:25,829.829 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 70.43938268206722
2022-03-16 17:52:40,292.292 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019387340173125267
2022-03-16 17:52:40,293.293 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 17:52:40,293.293 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'zebra', 'eating', 'grass', 'behind', 'a', 'fence', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 17:52:40,309.309 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['fence', 'building', 'wall', 'zebra', 'leg', 'post', 'tree', 'door', 'bush', 'trunk', 'ground', 'pole', 'leaf', 'plant', 'zoo', 'window', 'head', '[UNK]', 'enclosure', 'branch', 'dirt', 'tail', 'mane', 'stripe', 'ear', 'brick', 'wire', 'grass', 'flower', 'log', 'next', 'mouth', 'rock', 'pillar', 'front', 'neck', 'enclosed', 'garden', 'pen', 'gate', 'area', 'hair', 'blade', 'light', 'house', 'sign', 'nose', 'wood', 'box', 'standing']
2022-03-16 17:52:56,245.245 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['building', 'door', 'ground', 'post', 'wall', 'plant', 'window', 'tree', 'leg', 'brick', 'grass', 'blade', 'pole', 'dirt', 'leaf', 'trunk', 'fence', 'zoo', 'zebra']
2022-03-16 17:55:20,109.109 2829:trainer.py:487 do_train_dict(): eta: 17:55:33 iter: 28100 speed: 293.2 images/sec total_norm: 138.6513 (143.0182) loss: 146.6214 (146.7393) masked_loss: 1.5493 (1.5936) tag_loss: 144.5963 (145.1458) time: 1.4345 (1.7464) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4291 (1.7412) save_time: 8.8805 (21.7526) lr: 0.000058 max mem: 26307
2022-03-16 17:55:20,470.470 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5454545617103577
2022-03-16 17:55:20,471.471 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 127.68038177490234
2022-03-16 17:55:20,471.471 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 70.43796056382199
2022-03-16 17:55:35,057.057 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019437750801444054
2022-03-16 17:55:35,058.058 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 17:55:35,058.058 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'of', '[MASK]', 'with', 'luggage', 'at', 'an', 'airport', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 17:55:35,073.073 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'sign', 'man', 'floor', 'person', 'woman', 'short', 'airport', 'cart', 'ceiling', '[UNK]', 'wall', 'bag', 'hat', 'light', 'number', 'wheel', 'television', 'luggage', 'backpack', 'arrow', 'suitcase', 'building', 'chair', 'head', 'lady', 'shoe', 'flop', 'hand', 'cap', 'group', 'hair', 'foot', 'skirt', 'clock', 'jean', 'letter', 'arm', 'screen', 'line', 'glasses', 'boy', 'wheelchair', 'leg', 'girl', 'door', 'pillar', 'jacket', 'window', 'pole']
2022-03-16 17:55:51,044.044 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'group', 'hand', 'number', 'line', 'light', 'woman', 'short', 'television', 'hair', 'person', 'floor', 'wall', 'date', 'airport', 'lady', 'chair', 'box', 'sign', 'shirt', 'bag', 'wheel', 'ceiling', 'column', 'hat', 'cap', 'arrow', 'shoe', 'cart', 'backpack', 'pillar', 'luggage']
2022-03-16 17:58:14,627.627 2829:trainer.py:487 do_train_dict(): eta: 17:52:53 iter: 28200 speed: 293.4 images/sec total_norm: 141.2034 (143.5171) loss: 143.7106 (147.1741) masked_loss: 1.5716 (1.6050) tag_loss: 142.2991 (145.5690) time: 1.4339 (1.7452) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4286 (1.7400) save_time: 8.8805 (21.7526) lr: 0.000058 max mem: 26307
2022-03-16 17:58:14,987.987 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6285714507102966
2022-03-16 17:58:14,987.987 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 133.5584716796875
2022-03-16 17:58:14,988.988 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 70.44672195650243
2022-03-16 17:58:29,673.673 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019422173500061035
2022-03-16 17:58:29,673.673 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 17:58:29,674.674 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', 'boat', 'is', 'in', 'the', 'lake', ',', 'one', 'is', 'red', 'and', '[MASK]', 'and', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 17:58:29,689.689 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['water', 'boat', 'tree', 'building', 'roof', 'window', '[UNK]', 'wall', 'house', 'windshield', 'grass', 'moss', 'rock', 'shore', 'cabin', 'river', 'hill', 'small', 'chimney', 'shed', 'sky', 'car', 'ground', 'door', 'sign', 'forest', 'light', 'top', 'person', 'bank', 'reflection', 'ball', 'fence', 'puddle', 'mud', 'bush', 'number', 'rope', 'dock', 'branch', 'plant', 'white', 'old', 'front', 'wave', 'blue', 'wood', 'tire', 'next', 'body']
2022-03-16 17:58:45,656.656 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'house', 'water', 'building', 'white', 'door', 'red', 'car', 'blue', 'post', 'lake', 'wall', 'hill', 'window', 'tree', 'boat', 'roof', 'grass', 'cone', 'chimney', 'windshield', 'puddle']
2022-03-16 18:01:09,314.314 2829:trainer.py:487 do_train_dict(): eta: 17:50:13 iter: 28300 speed: 293.1 images/sec total_norm: 142.0648 (145.7118) loss: 147.4682 (149.7926) masked_loss: 1.5994 (1.6333) tag_loss: 145.7122 (148.1592) time: 1.4338 (1.7469) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4288 (1.7419) save_time: 8.8805 (21.7526) lr: 0.000057 max mem: 26307
2022-03-16 18:01:09,678.678 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5
2022-03-16 18:01:09,678.678 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 132.0526580810547
2022-03-16 18:01:09,678.678 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 70.45548502156433
= 70.45548502156433 2022-03-16 18:01:24,317.317 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01942465826869011 2022-03-16 18:01:24,318.318 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 18:01:24,318.318 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'of', 'cows', 'walking', 'down', 'areas', 'road', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 18:01:24,333.333 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['truck', 'cow', 'pole', 'ground', 'rock', 'windshield', 'mountain', '[UNK]', 'tire', 'grass', 'goat', 'road', 'post', 'cab', 'wheel', 'dirt', 'animal', 'flag', 'man', 'wall', 'door', 'leg', 'sign', 'cover', 'front', 'trailer', 'herd', 'fence', 'mirror', 'tail', 'shirt', 'sheep', 'license', 'bull', 'block', 'person', 'hill', 'light', 'window', 'tree', 'van', 'horse', 'group', 'sky', 'banner', 'calf', 'cattle', 'bumper', 'number', 'head'] 2022-03-16 18:01:40,237.237 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'road', 'ground', 'rock', 'post', 'star', 'cover', 'mountain', 'letter', 'animal', 'truck', 'flag', 'grass', 'bush', 'pole', 'dirt', 'logo', 'sheep', 'fence', 'cab', 'cow', 'tire', 'cone', 'goat', 'herd', 'tractor', 'windshield'] 2022-03-16 18:04:04,388.388 2829:trainer.py:487 do_train_dict(): eta: 17:47:34 iter: 28400 speed: 292.5 images/sec total_norm: 141.2142 (142.4258) loss: 149.6705 (148.8117) masked_loss: 1.5869 (1.6020) tag_loss: 148.1572 (147.2096) time: 1.4349 (1.7507) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4299 (1.7455) save_time: 8.8805 (21.7526) lr: 0.000057 max mem: 26307 2022-03-16 18:04:04,749.749 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7297297120094299 2022-03-16 18:04:04,749.749 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 143.73825073242188 2022-03-16 18:04:04,750.750 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.45856715419836 2022-03-16 18:04:19,544.544 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019467096775770187 2022-03-16 18:04:19,544.544 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 18:04:19,544.544 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'carrot', '##s', 'hang', 'tied', 'together', 'on', 'a', 'pole', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 18:04:19,559.559 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['leaf', 'carrot', '[UNK]', 'vegetable', 'stem', 'bunch', 'top', 'table', 'pile', 'other', 'ground', 'plant', 'large', 'fresh', 'orange', 'wall', 'cloth', 'person', 'plastic', 'full', 'hand', 'ring', 'close', 'light', 'basket', 'different', 'banana', 'paper', 'next', 'flower', 'head', 'background', 'various', 'white', 'group', 'pepper', 'green', 'tag', 'garden', 'sign', 'tree', 'bag', 'red', 'dirt', 'band', 'small', 'market', 'sweet', 'many', 'potato'] 2022-03-16 18:04:35,412.412 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'top', 'table', 'pole', 'leaf', 'bunch', 'vegetable', 'carrot'] 2022-03-16 18:06:59,115.115 2829:trainer.py:487 do_train_dict(): eta: 17:44:53 iter: 28500 speed: 293.0 images/sec total_norm: 138.6740 (140.2635) loss: 146.8426 (146.2962) masked_loss: 1.5447 (1.5367) tag_loss: 145.6845 (144.7594) time: 1.4339 (1.7473) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4286 (1.7421) save_time: 8.8805 (21.7526) lr: 0.000057 max mem: 26307 2022-03-16 18:06:59,476.476 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6363636255264282 2022-03-16 18:06:59,476.476 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 158.02423095703125 2022-03-16 18:06:59,477.477 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.45932896487362 2022-03-16 18:07:14,200.200 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019464680925011635 2022-03-16 18:07:14,201.201 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 18:07:14,201.201 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'several', 'motorcycles', 'parked', 'in', 'rows', 'on', 'display', 'in', 'a', '##⁄₄', 'lot', '##zi', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 18:07:14,216.216 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['motorcycle', 'man', 'tire', 'shirt', 'bike', 'person', 'seat', 'jean', '[UNK]', 'wheel', 'street', 'tank', 'helmet', 'short', 'light', 'engine', 'woman', 'ground', 'line', 'road', 'building', 'fender', 'gas', 'tree', 'sign', 'shoe', 'pavement', 'logo', 'hat', 'lot', 'parking', 'car', 'bag', 'windshield', 'hair', 'mirror', 'pipe', 'red', 'pole', 'sidewalk', 'sky', 'dress', 'window', 'next', 'sunglasses', 'row', 'other', 'parked', 'exhaust', 'crowd'] 2022-03-16 18:07:30,111.111 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'hand', 'building', 'road', 'street', 'short', 'hair', 'person', 'seat', 'lot', 'arm', 'tree', 'sign', 'jean', 'shirt', 'gas', 'display', 'tank', 'wheel', 'parking', 'bike', 'pipe', 'motorcycle', 'helmet', 'tire', 'pavement', 'fender', 'windshield'] 2022-03-16 18:09:53,963.963 2829:trainer.py:487 do_train_dict(): eta: 17:42:13 iter: 28600 speed: 292.8 images/sec total_norm: 137.4933 (139.8967) loss: 145.6241 (147.6192) masked_loss: 1.5913 (1.6291) tag_loss: 144.3860 (145.9901) time: 1.4332 (1.7485) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4281 (1.7433) save_time: 8.8805 (21.7526) lr: 0.000057 max mem: 26307 2022-03-16 18:09:54,324.324 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6363636255264282 2022-03-16 18:09:54,325.325 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 148.09474182128906 2022-03-16 18:09:54,325.325 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
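The "Input ids sample" rows are BERT-tokenized captions corrupted for masked-language modeling and padded with '[PAD]' to a fixed length of 70. Stray off-caption tokens such as '##⁄₄' and '##zi' in the motorcycle caption above (or 'areas' in the cows caption earlier) are exactly what standard BERT corruption produces, where a selected position is sometimes replaced by a random vocabulary token instead of '[MASK]'. A sketch under the standard 15% selection and 80/10/10 replacement assumption; this pipeline's actual ratios are not shown, and ids 101/102/103 are the usual bert-base-uncased [CLS]/[SEP]/[MASK]:

import random

def mask_caption(ids, mask_id, vocab_size, special, p=0.15):
    """BERT-style corruption (assumed ratios): for ~p of non-special positions,
    80% -> [MASK], 10% -> random token, 10% -> unchanged.
    Returns (inputs, labels); labels use -100 so the loss ignores them."""
    inputs, labels = list(ids), [-100] * len(ids)
    for i, t in enumerate(ids):
        if t in special or random.random() >= p:
            continue
        labels[i] = t
        r = random.random()
        if r < 0.8:
            inputs[i] = mask_id
        elif r < 0.9:
            inputs[i] = random.randrange(vocab_size)  # source of off-caption tokens
        # else: keep the original token
    return inputs, labels

ids = [101, 1037, 4049, 1997, 102]  # toy "[CLS] a boat of [SEP]" ids
# In real use, special should also include [PAD] (id 0).
inputs, labels = mask_caption(ids, mask_id=103, vocab_size=30522,
                              special={101, 102})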
= 70.4683858226816 2022-03-16 18:10:09,157.157 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01946013793349266 2022-03-16 18:10:09,158.158 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 18:10:09,158.158 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'cat', 'that', '[MASK]', 'looking', 'at', '[MASK]', 'television', 'screen', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 18:10:09,174.174 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['cat', 'wall', 'ear', 'television', 'head', 'screen', 'table', 'logo', 'tail', 'paw', 'curtain', 'bowl', 'stand', 'paper', 'stripe', 'leg', 'light', 'window', 'box', '[UNK]', 'cord', 'lid', 'door', 'button', 'book', 'eye', 'container', 'room', 'reflection', 'nose', 'cover', 'shelf', 'device', 'top', 'toilet', 'tv', 'monitor', 'glass', 'seat', 'black', 'dvd', 'remote', 'cd', 'computer', 'basket', 'base', 'bottle', 'wire', 'tag', 'front'] 2022-03-16 18:10:25,155.155 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'book', 'television', 'table', 'wall', 'writing', 'box', 'screen', 'card', 'leg', 'ear', 'bowl', 'cat', 'wire', 'logo', 'cloth', 'curtain', 'cord', 'lid', 'paw'] 2022-03-16 18:12:48,639.639 2829:trainer.py:487 do_train_dict(): eta: 17:39:33 iter: 28700 speed: 293.1 images/sec total_norm: 142.9664 (144.5648) loss: 148.8179 (149.9991) masked_loss: 1.6053 (1.6532) tag_loss: 147.2126 (148.3459) time: 1.4318 (1.7467) data: 0.0001 (0.0001) to_device: 0.0052 (0.0051) time_gpu: 1.4265 (1.7415) save_time: 8.8805 (21.7526) lr: 0.000057 max mem: 26307 2022-03-16 18:12:49,001.001 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4545454680919647 2022-03-16 18:12:49,001.001 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 173.98463439941406 2022-03-16 18:12:49,001.001 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
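The "caption acc" readings have small-integer denominators (0.4545... = 5/11 in the record above, 0.6285... = 22/35 earlier), which is consistent with accuracy being measured only over the supervised masked positions of one logging batch rather than over all tokens; the pipeline's exact definition is not shown. A sketch of that per-masked-token accuracy, with illustrative names and shapes:

import torch

def masked_token_accuracy(logits, labels, ignore_index=-100):
    """Accuracy over supervised (masked) positions only; labels carry
    ignore_index everywhere else, as in the masking sketch above."""
    preds = logits.argmax(dim=-1)
    keep = labels != ignore_index
    correct = (preds[keep] == labels[keep]).sum()
    return (correct.float() / keep.sum().clamp(min=1)).item()

logits = torch.randn(2, 70, 30522)   # (batch, seq_len, vocab) dummy scores
labels = torch.full((2, 70), -100)   # only two positions are supervised
labels[0, 2], labels[1, 4] = 17, 99
print(masked_token_accuracy(logits, labels))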
= 70.45268759462569 03-16 18:12:50.866 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 18:12:50.866 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 18:12:51.558 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 0}] 2022-03-16 18:13:03,809.809 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019492145627737045 2022-03-16 18:13:03,809.809 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 18:13:03,809.809 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'several', '[MASK]', 'types', 'of', 'tools', 'are', 'arranged', 'on', 'the', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 18:13:03,824.824 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['scissors', 'handle', 'table', 'blade', 'cloth', '[UNK]', 'pair', 'net', 'wire', 'mat', 'tape', 'string', 'hole', 'floor', 'green', 'ground', 'cord', 'top', 'line', 'pen', 'band', 'other', 'screw', 'paper', 'cap', 'ball', 'blanket', 'number', 'surface', 'wall', 'next', 'plastic', 'bowl', 'ribbon', 'blue', 'circle', 'eye', 'spot', 'towel', 'logo', 'board', 'red', 'man', 'tool', 'different', 'shadow', 'bolt', 'letter', 'bunch', 'bag'] 2022-03-16 18:13:19,711.711 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'band', 'top', 'different', 'table', 'paper', 'metal', 'label', 'bottom', 'spot', 'handle', 'string', 'bottle', 'plastic', 'cap', 'cloth', 'mat', 'banana', 'scissors'] 2022-03-16 18:15:43,709.709 2829:trainer.py:487 do_train_dict(): eta: 17:36:53 iter: 28800 speed: 292.5 images/sec total_norm: 138.5986 (144.3402) loss: 147.9702 (148.6646) masked_loss: 1.6064 (1.6054) tag_loss: 146.7345 (147.0592) time: 1.4345 (1.7507) data: 0.0001 (0.0005) to_device: 0.0051 (0.0049) time_gpu: 1.4292 (1.7453) save_time: 8.8805 (21.7526) lr: 0.000057 max mem: 26307 2022-03-16 18:15:44,070.070 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5833333134651184 2022-03-16 18:15:44,071.071 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 142.14830017089844 2022-03-16 18:15:44,071.071 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
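Alongside training, aml_server.py polls the GPUs roughly every half hour and logs one dict per device; utilization happens to read 0% in this snapshot and 100% in the 18:42 one, which mostly reflects where the poll lands relative to synchronization and logging pauses. monitor()'s implementation is not in the log, but the same dicts can be produced with nvidia-smi's query interface (a real CLI feature; the function name below is illustrative):

import subprocess

def gpu_snapshot():
    """One {'mem_used', 'mem_total', 'gpu_util'} dict per GPU, matching the
    monitor() output format above (field names copied from the log)."""
    out = subprocess.check_output([
        "nvidia-smi",
        "--query-gpu=memory.used,memory.total,utilization.gpu",
        "--format=csv,noheader,nounits",
    ], text=True)
    gpus = []
    for row in out.strip().splitlines():
        used, total, util = (int(x) for x in row.split(", "))
        gpus.append({"mem_used": used, "mem_total": total, "gpu_util": util})
    return gpus

# Requires nvidia-smi on PATH; prints e.g. [{'mem_used': 29000, ...}, ...]
print(gpu_snapshot())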
= 70.44963449920337 2022-03-16 18:15:58,981.981 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019483480602502823 2022-03-16 18:15:58,981.981 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 18:15:58,981.981 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'bowl', 'of', 'sh', '##ree', '##ded', '[MASK]', '##s', 'in', 'milk', 'in', 'and', 'a', 'half', 'eaten', '[MASK]', '##nut', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 18:15:58,996.996 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'table', 'hole', 'plate', 'spoon', 'bowl', 'cereal', 'chocolate', 'handle', 'reflection', 'food', 'cream', 'nut', 'napkin', 'fork', 'cup', 'glass', 'milk', 'paper', 'white', 'ice', 'dessert', 'desert', 'dish', 'cookie', 'light', 'next', 'chip', 'rim', 'container', 'top', 'half', 'eaten', 'coffee', 'line', 'knife', 'bacon', 'liquid', 'box', 'small', 'sugar', 'bottom', 'blue', 'slice', 'hand', 'drink', 'mug', 'almond', 'couple', 'leg'] 2022-03-16 18:16:14,907.907 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'half', 'table', 'food', 'paper', 'wood', 'bowl', 'hole', 'handle', 'plate', 'meat', 'milk', 'pen', 'chocolate', 'eaten', 'spoon'] 2022-03-16 18:18:38,571.571 2829:trainer.py:487 do_train_dict(): eta: 17:34:12 iter: 28900 speed: 292.8 images/sec total_norm: 139.8740 (142.7727) loss: 146.2504 (147.4974) masked_loss: 1.5786 (1.6387) tag_loss: 144.3844 (145.8588) time: 1.4325 (1.7487) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4274 (1.7434) save_time: 8.8805 (21.7526) lr: 0.000056 max mem: 26307 2022-03-16 18:18:38,932.932 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7058823704719543 2022-03-16 18:18:38,933.933 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 140.38246154785156 2022-03-16 18:18:38,933.933 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.4415922888394 2022-03-16 18:18:53,793.793 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019552214071154594 2022-03-16 18:18:53,793.793 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 18:18:53,794.794 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'tray', 'filled', 'with', 'po', '[MASK]', '##gra', '[MASK]', '##s', 'and', 'other', 'cut', 'fruits', 'to', 'eat', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 18:18:53,809.809 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['fruit', 'apple', 'stem', 'table', '[UNK]', 'bunch', 'pile', 'red', 'spot', 'pear', 'plate', 'bowl', 'banana', 'carrot', 'potato', 'top', 'full', 'vegetable', 'strawberry', 'flower', 'other', 'orange', 'onion', 'next', 'group', 'many', 'writing', 'board', 'paper', 'box', 'tray', 'wall', 'different', 'cardboard', 'variety', 'berry', 'various', 'grape', 'container', 'close', 'hole', 'end', 'bag', 'plastic', 'ripe', 'sign', 'white', 'skin', 'fresh', 'large'] 2022-03-16 18:19:09,632.632 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'other', 'cut', 'orange', 'bowl', 'fruit', 'plastic', 'apple', 'stem', 'container', 'tray', 'banana'] 2022-03-16 18:21:33,653.653 2829:trainer.py:487 do_train_dict(): eta: 17:31:32 iter: 29000 speed: 292.4 images/sec total_norm: 138.8702 (141.7669) loss: 148.8365 (150.4626) masked_loss: 1.5171 (1.5620) tag_loss: 147.5899 (148.9006) time: 1.4336 (1.7507) data: 0.0001 (0.0001) to_device: 0.0052 (0.0051) time_gpu: 1.4282 (1.7455) save_time: 8.8805 (21.7526) lr: 0.000056 max mem: 26307 2022-03-16 18:21:34,012.012 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6470588445663452 2022-03-16 18:21:34,013.013 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 128.46090698242188 2022-03-16 18:21:34,013.013 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.4574932216369 2022-03-16 18:21:49,079.079 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019599059596657753 2022-03-16 18:21:49,079.079 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 18:21:49,080.080 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'on', 'a', 'motorcycle', 'holding', '[MASK]', 'dog', 'while', 'looking', 'down', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 18:21:49,095.095 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hair', 'tree', 'man', 'jacket', 'dog', 'head', 'bush', 'hand', 'grass', 'stripe', 'face', 'shirt', 'background', 'motorcycle', '[UNK]', 'road', 'mirror', 'sky', 'seat', 'house', 'vest', 'glasses', 'bike', 'bag', 'ear', 'arm', 'building', 'window', 'back', 'pole', 'fence', 'car', 'coat', 'wheel', 'tire', 'shoulder', 'person', 'mouth', 'nose', 'leg', 'shoe', 'hood', 'trunk', 'collar', 'ground', 'jean', 'sidewalk', 'strap', 'chair', 'harness'] 2022-03-16 18:22:05,030.030 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'hand', 'face', 'road', 'hair', 'arm', 'tree', 'shirt', 'dog', 'background', 'nose', 'mirror', 'grass', 'bush', 'jacket', 'fence', 'motorcycle', 'vest', 'stripe'] 2022-03-16 18:24:28,534.534 2829:trainer.py:487 do_train_dict(): eta: 17:28:51 iter: 29100 speed: 292.8 images/sec total_norm: 142.3582 (144.3920) loss: 146.1716 (149.5215) masked_loss: 1.5811 (1.6037) tag_loss: 144.0772 (147.9178) time: 1.4338 (1.7489) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4289 (1.7439) save_time: 8.8805 (21.7526) lr: 0.000056 max mem: 26307 2022-03-16 18:24:28,897.897 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5142857432365417 2022-03-16 18:24:28,897.897 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 147.3625030517578 2022-03-16 18:24:28,898.898 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
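Each logging step prints the 50 highest-scoring tags ("Sample Generation") next to the ground-truth set ("GT Tags"), and the running "Tag Precision" figure hovers around 70.5 across this span. A plausible reading, not confirmed by the log, is precision@k: the fraction of the top-k predicted tags that appear in the GT set, accumulated across images. A per-image sketch with k matching the 50 sampled tags (all names are illustrative):

def topk_tag_precision(scores, gt_tags, k=50):
    """Precision@k for one image: the share of the k highest-scoring tags
    that belong to the ground-truth set, as a percentage."""
    topk = sorted(scores, key=scores.get, reverse=True)[:k]
    hits = sum(1 for t in topk if t in gt_tags)
    return 100.0 * hits / k

scores = {"man": 0.9, "shirt": 0.8, "dog": 0.7, "cat": 0.1}
print(topk_tag_precision(scores, {"man", "dog"}, k=3))  # 66.67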
= 70.46819223116522 2022-03-16 18:24:44,114.114 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01960493065416813 2022-03-16 18:24:44,114.114 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 18:24:44,115.115 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'person', '[MASK]', 'on', 'the', 'snow', 'covered', 'mountain', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 18:24:44,130.130 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', '[UNK]', 'jacket', 'snow', 'sky', 'person', 'helmet', 'ground', 'head', 'man', 'glove', 'ski', 'pole', 'coat', 'skier', 'hand', 'face', 'foot', 'stripe', 'rock', 'hill', 'arm', 'hood', 'snowy', 'slope', 'leg', 'hat', 'pine', 'boot', 'leaf', 'mountain', 'board', 'top', 'steep', 'logo', 'plant', 'side', 'downhill', 'bush', 'stick', 'jump', 'air', 'design', 'branch', 'backpack', 'day', 'suit', 'track', 'poles', 'forest'] 2022-03-16 18:25:00,133.133 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'ground', 'person', 'mountain', 'plant', 'foot', 'tree', 'sky', 'clothes', 'snow', 'coat', 'pole', 'jacket', 'pine', 'logo', 'ski', 'boot', 'helmet', 'glove', 'stripe', 'skier'] 2022-03-16 18:27:23,560.560 2829:trainer.py:487 do_train_dict(): eta: 17:26:11 iter: 29200 speed: 292.5 images/sec total_norm: 141.7175 (143.5716) loss: 149.6364 (148.0433) masked_loss: 1.5976 (1.5875) tag_loss: 148.1038 (146.4559) time: 1.4328 (1.7503) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4276 (1.7451) save_time: 8.8805 (21.7526) lr: 0.000056 max mem: 26307 2022-03-16 18:27:23,921.921 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6666666865348816 2022-03-16 18:27:23,921.921 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 115.51433563232422 2022-03-16 18:27:23,922.922 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.48557238529973 2022-03-16 18:27:38,955.955 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019604241475462914 2022-03-16 18:27:38,955.955 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 18:27:38,955.955 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'motorcycle', 'parked', 'in', 'a', 'city', 'street', 'with', 'houses', '[MASK]', 'the', 'background', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 18:27:38,971.971 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['motorcycle', 'tire', 'building', 'window', 'bike', 'tree', 'seat', 'sign', 'wheel', 'roof', 'car', 'mirror', '[UNK]', 'windshield', 'ground', 'road', 'street', 'sky', 'engine', 'pipe', 'light', 'house', 'logo', 'fender', 'brick', 'pole', 'exhaust', 'gas', 'leaf', 'chimney', 'handle', 'door', 'parked', 'black', 'spoke', 'rim', 'side', 'grass', 'sidewalk', 'front', 'wall', 'next', 'box', 'city', 'tank', 'lot', 'parking', 'license', 'plate', 'bush'] 2022-03-16 18:27:54,802.802 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'city', 'house', 'building', 'door', 'road', 'street', 'light', 'car', 'ground', 'seat', 'engine', 'window', 'tree', 'box', 'sign', 'background', 'roof', 'wheel', 'mirror', 'grass', 'leaf', 'garage', 'globe', 'bike', 'pipe', 'motorcycle', 'rim', 'tire', 'suv', 'windshield'] 2022-03-16 18:30:18,591.591 2829:trainer.py:487 do_train_dict(): eta: 17:23:30 iter: 29300 speed: 292.5 images/sec total_norm: 140.8023 (143.9691) loss: 146.9014 (148.8958) masked_loss: 1.5923 (1.5947) tag_loss: 145.8006 (147.3011) time: 1.4321 (1.7502) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4270 (1.7450) save_time: 8.8805 (21.7526) lr: 0.000056 max mem: 26307 2022-03-16 18:30:18,951.951 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.47058823704719543 2022-03-16 18:30:18,951.951 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 145.95120239257812 2022-03-16 18:30:18,951.951 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.48794665952929 2022-03-16 18:30:34,009.009 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019610542804002762 2022-03-16 18:30:34,010.010 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 18:30:34,010.010 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'delicious', 'looking', 'plate', 'of', 'fresh', 'bro', '##cco', '[MASK]', 'and', 'some', 'type', 'of', 'meat', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 18:30:34,026.026 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['plate', 'table', 'chicken', 'food', '[UNK]', 'meat', 'bread', 'fork', 'carrot', 'white', 'potato', 'handle', 'mushroom', 'crust', 'glass', 'knife', 'napkin', 'shadow', 'tomato', 'cup', 'fish', 'sauce', 'onion', 'stem', 'vegetable', 'bun', 'design', 'pizza', 'slice', 'bowl', 'cheese', 'sandwich', 'reflection', 'top', 'leaf', 'pepper', 'spoon', 'piece', 'logo', 'base', 'spot', 'ham', 'meal', 'sausage', 'light', 'bean', 'salad', 'wall', 'shrimp', 'bottle'] 2022-03-16 18:30:49,883.883 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'cup', 'table', 'type', 'food', 'fresh', 'handle', 'plate', 'meat', 'stem', 'chicken', 'fork'] 2022-03-16 18:33:13,729.729 2829:trainer.py:487 do_train_dict(): eta: 17:20:50 iter: 29400 speed: 292.3 images/sec total_norm: 142.0972 (144.0886) loss: 148.7812 (150.4367) masked_loss: 1.6246 (1.5846) tag_loss: 147.6292 (148.8521) time: 1.4328 (1.7514) data: 0.0001 (0.0002) to_device: 0.0052 (0.0050) time_gpu: 1.4277 (1.7462) save_time: 8.8805 (21.7526) lr: 0.000056 max mem: 26307 2022-03-16 18:33:14,089.089 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6000000238418579 2022-03-16 18:33:14,090.090 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 137.72116088867188 2022-03-16 18:33:14,090.090 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
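"Tag mAP" creeps from about 0.0194 to 0.0196 across this span. Values that small are unsurprising for a mean over a large multi-label tag vocabulary early in training, since most classes have few positives in any evaluation window; the "# of tokens = 577" printed next to it is plausibly the size of the candidate tag set, though the log does not say. A standard per-class average-precision sketch (skipping classes with no positives is an assumed convention):

import numpy as np
from sklearn.metrics import average_precision_score

def tag_map(scores, targets):
    """Mean AP across tag classes; scores and targets are (n_images, n_tags).
    Classes with no positive image are skipped."""
    aps = [average_precision_score(targets[:, c], scores[:, c])
           for c in range(targets.shape[1]) if targets[:, c].any()]
    return float(np.mean(aps)) if aps else 0.0

rng = np.random.default_rng(0)
scores = rng.random((8, 577))                       # 577 matches "# of tokens"
targets = (rng.random((8, 577)) < 0.05).astype(int)
print(tag_map(scores, targets))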
= 70.50300521850586 2022-03-16 18:33:29,295.295 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019644897431135178 2022-03-16 18:33:29,295.295 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 18:33:29,295.295 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'woman', 'watches', 'as', 'a', 'bald', 'eagle', 'flies', 'by', 'her', '[MASK]', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 18:33:29,311.311 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['head', 'hair', 'face', 'bird', 'tree', 'woman', 'mouth', 'wing', 'eagle', 'eye', 'feather', 'tail', 'beak', 'shirt', 'girl', 'nose', 'bush', 'foot', 'arm', 'neck', 'person', 'air', 'hand', 'ear', 'leg', '[UNK]', 'top', 'man', 'white', 'glove', 'jacket', 'black', 'teeth', 'sky', 'background', 'name', 'image', 'plant', 'strap', 'dress', 'watch', 'young', 'wrist', 'chest', 'shoulder', 'couple', 'jean', 'beautiful', 'large', 'necklace'] 2022-03-16 18:33:45,204.204 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'face', 'woman', 'hair', 'mouth', 'post', 'eye', 'wing', 'tree', 'shirt', 'leg', 'nose', 'bird', 'tail', 'eagle', 'fence', 'bald', 'feather', 'beak'] 2022-03-16 18:36:08,769.769 2829:trainer.py:487 do_train_dict(): eta: 17:18:09 iter: 29500 speed: 292.5 images/sec total_norm: 139.5855 (142.6077) loss: 146.1849 (146.5057) masked_loss: 1.6106 (1.5761) tag_loss: 144.9258 (144.9296) time: 1.4323 (1.7504) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4269 (1.7452) save_time: 8.8805 (21.7526) lr: 0.000056 max mem: 26307 2022-03-16 18:36:09,129.129 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5714285969734192 2022-03-16 18:36:09,129.129 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 151.2039031982422 2022-03-16 18:36:09,129.129 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.50467484706157 2022-03-16 18:36:24,480.480 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01969200186431408 2022-03-16 18:36:24,480.480 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 18:36:24,481.481 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'an', 'older', 'model', 'delivery', 'truck', '[MASK]', 'in', 'front', '[MASK]', 'some', 'trees', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 18:36:24,496.496 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tire', '[UNK]', 'truck', 'grill', 'windshield', 'wheel', 'ground', 'bumper', 'mirror', 'window', 'tree', 'dirt', 'light', 'door', 'road', 'grass', 'front', 'sky', 'steering', 'bed', 'building', 'license', 'rim', 'hood', 'logo', 'roof', 'plate', 'pole', 'trailer', 'fence', 'car', 'person', 'number', 'mud', 'old', 'wall', 'next', 'white', 'large', 'sign', 'lot', 'emblem', 'bus', 'fender', 'man', 'side', 'vehicle', 'step', 'head', 'field'] 2022-03-16 18:36:40,518.518 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'door', 'road', 'front', 'light', 'car', 'ground', 'model', 'cover', 'window', 'step', 'tree', 'sky', 'picture', 'truck', 'plate', 'wheel', 'mirror', 'grass', 'license', 'delivery', 'logo', 'steering', 'rim', 'tire', 'emblem', 'grill', 'windshield', 'bumper'] 2022-03-16 18:39:04,228.228 2829:trainer.py:487 do_train_dict(): eta: 17:15:28 iter: 29600 speed: 291.8 images/sec total_norm: 139.1917 (142.5527) loss: 148.6858 (148.5452) masked_loss: 1.4876 (1.5223) tag_loss: 147.4291 (147.0229) time: 1.4338 (1.7546) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4286 (1.7494) save_time: 8.8805 (21.7526) lr: 0.000055 max mem: 26307 2022-03-16 18:39:04,588.588 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091 2022-03-16 18:39:04,589.589 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 148.98666381835938 2022-03-16 18:39:04,589.589 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.51333668256046 2022-03-16 18:39:19,833.833 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019648294895887375 2022-03-16 18:39:19,833.833 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 18:39:19,833.833 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'kitchen', 'setting', 'with', 'white', '[MASK]', '##lian', '##ce', 'and', '[MASK]', 'counters', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 18:39:19,849.849 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['refrigerator', 'wall', 'kitchen', '[UNK]', 'floor', 'door', 'cabinet', 'window', 'handle', 'ceiling', 'box', 'blind', 'microwave', 'light', 'drawer', 'sink', 'towel', 'stove', 'cord', 'tile', 'bag', 'outlet', 'shelf', 'oven', 'white', 'vent', 'magnet', 'basket', 'bottle', 'room', 'table', 'chair', 'mirror', 'paper', 'switch', 'top', 'fridge', 'can', 'knob', 'pot', 'bowl', 'trash', 'counter', 'fan', 'rack', 'cup', 'hood', 'maker', 'reflection', 'picture'] 2022-03-16 18:39:35,745.745 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'can', 'white', 'door', 'light', 'floor', 'wall', 'chair', 'window', 'box', 'wood', 'kitchen', 'handle', 'cabinet', 'ceiling', 'blind', 'sink', 'pot', 'towel', 'trash', 'lid', 'outlet', 'mat', 'tile', 'stove', 'refrigerator', 'microwave', 'vent', 'rug'] 2022-03-16 18:41:59,396.396 2829:trainer.py:487 do_train_dict(): eta: 17:12:48 iter: 29700 speed: 292.3 images/sec total_norm: 140.2098 (144.2407) loss: 148.6682 (147.7390) masked_loss: 1.5420 (1.5807) tag_loss: 146.2836 (146.1583) time: 1.4330 (1.7517) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4279 (1.7465) save_time: 8.8805 (21.7526) lr: 0.000055 max mem: 26307 2022-03-16 18:41:59,757.757 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.39393940567970276 2022-03-16 18:41:59,757.757 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 141.75442504882812 2022-03-16 18:41:59,758.758 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
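The eta field shrinks by roughly 2m40s per 100 iterations, which matches remaining-iterations times the smoothed step time. max_iter is never printed in this excerpt; backing it out from the iter-29700 record above (eta 17:12:48 at 1.7517 s/iter) gives about 65k, and the value below is chosen so the round-trip reproduces that record exactly, so treat it as hypothetical:

import datetime

def eta_seconds(iter_now, max_iter, avg_iter_time):
    """Remaining wall-clock estimate, printed by the logger as hh:mm:ss."""
    return (max_iter - iter_now) * avg_iter_time

# max_iter backed out from the iter-29700 record; not shown in the log.
print(datetime.timedelta(seconds=round(eta_seconds(29_700, 65_076, 1.7517))))
# -> 17:12:48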
= 70.52547696772838 2022-03-16 18:42:15,191.191 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019615011289715767 2022-03-16 18:42:15,191.191 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 18:42:15,191.191 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'qu', '##aint', 'kitchen', 'with', '[MASK]', 'flowers', 'on', 'a', 'small', 'central', 'table', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 18:42:15,207.207 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'kitchen', 'table', '[UNK]', 'refrigerator', 'chair', 'floor', 'cabinet', 'door', 'microwave', 'handle', 'rug', 'plate', 'ceiling', 'curtain', 'light', 'bowl', 'flower', 'bottle', 'oven', 'drawer', 'leg', 'window', 'sink', 'towel', 'room', 'shelf', 'stove', 'pot', 'vase', 'cloth', 'cup', 'plant', 'picture', 'dish', 'magnet', 'maker', 'design', 'coffee', 'cushion', 'clock', 'tray', 'blanket', 'mirror', 'stool', 'fruit', 'glass', 'top', 'dining', 'knob'] 2022-03-16 18:42:31,113.113 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'small', 'door', 'central', 'cup', 'board', 'floor', 'table', 'wall', 'chair', 'kitchen', 'bowl', 'handle', 'pink', 'clock', 'plate', 'cabinet', 'cutting', 'flower', 'sink', 'cloth', 'toy', 'towel', 'tile', 'rack', 'jar', 'stove', 'dresser', 'magnet', 'knob', 'oven', 'refrigerator', 'microwave', 'vase', 'rug'] 03-16 18:42:51.561 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 18:42:51.561 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 18:42:52.914 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-16 18:44:54,805.805 2829:trainer.py:487 do_train_dict(): eta: 17:10:07 iter: 29800 speed: 291.9 images/sec total_norm: 139.4920 (143.6648) loss: 144.2754 (146.0038) masked_loss: 1.6383 (1.6250) tag_loss: 142.6230 (144.3788) time: 1.4329 (1.7540) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4278 (1.7489) save_time: 8.8805 (21.7526) lr: 0.000055 max mem: 26307 2022-03-16 18:44:55,166.166 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6000000238418579 2022-03-16 18:44:55,170.170 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 107.9897689819336 2022-03-16 18:44:55,171.171 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.54795055325613 2022-03-16 18:45:10,660.660 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019662857055664062 2022-03-16 18:45:10,661.661 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 18:45:10,661.661 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'close', 'commons', 'of', 'a', '[MASK]', 'in', 'front', 'of', 'a', 'refrigerator', 'with', 'its', 'door', 'open', 'taxonomy', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 18:45:10,677.677 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['bottle', 'hand', 'boy', 'shirt', 'eye', 'hair', 'refrigerator', 'shelf', 'finger', 'nose', 'label', '[UNK]', 'apple', 'beer', 'door', 'wall', 'bag', 'face', 'head', 'person', 'wine', 'can', 'cap', 'glass', 'mouth', 'rack', 'ear', 'drawer', 'young', 'floor', 'cooler', 'lid', 'handle', 'container', 'fridge', 'bin', 'jar', 'ball', 'top', 'child', 'button', 'open', 'arm', 'sleeve', 'egg', 'front', 'thumb', 'kid', 'man', 'drink'] 2022-03-16 18:45:26,611.611 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['hand', 'open', 'door', 'front', 'close', 'hair', 'mouth', 'person', 'wall', 'boy', 'eye', 'shirt', 'label', 'finger', 'nose', 'wine', 'bag', 'beer', 'bottle', 'apple', 'shelf', 'lid', 'drawer', 'jar', 'refrigerator'] 2022-03-16 18:47:49,944.944 2829:trainer.py:487 do_train_dict(): eta: 17:07:26 iter: 29900 speed: 292.3 images/sec total_norm: 141.0909 (142.3764) loss: 146.9476 (147.9617) masked_loss: 1.5776 (1.6118) tag_loss: 144.7945 (146.3500) time: 1.4311 (1.7514) data: 0.0001 (0.0005) to_device: 0.0051 (0.0051) time_gpu: 1.4258 (1.7458) save_time: 8.8805 (21.7526) lr: 0.000055 max mem: 26307 2022-03-16 18:47:50,304.304 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091 2022-03-16 18:47:50,304.304 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 144.07025146484375 2022-03-16 18:47:50,305.305 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.55842314402263 2022-03-16 18:48:05,803.803 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019699733704328537 2022-03-16 18:48:05,804.804 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 18:48:05,804.804 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'red', 'fire', 'hydra', '##nt', 'leaking', 'water', 'all', 'over', 'a', 'street', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 18:48:05,819.819 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['ground', 'chain', 'fire', 'water', 'leaf', 'puddle', 'paint', '[UNK]', 'reflection', 'base', 'red', 'mud', 'grass', 'hole', 'top', 'wall', 'pole', 'moss', 'rock', 'branch', 'stick', 'shadow', 'cap', 'curb', 'light', 'tree', 'yellow', 'dirt', 'dirty', 'pond', 'bolt', 'trash', 'flower', 'next', 'pipe', 'rusty', 'sign', 'old', 'object', 'paper', 'trunk', 'metal', 'open', 'drain', 'building', 'road', 'area', 'post', 'graffiti', 'leg'] 2022-03-16 18:48:21,674.674 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'water', 'street', 'red', 'fire', 'ground', 'chain', 'grass', 'leaf', 'reflection', 'gravel', 'drain', 'hose', 'puddle'] 2022-03-16 18:50:45,483.483 2829:trainer.py:487 do_train_dict(): eta: 17:04:45 iter: 30000 speed: 291.7 images/sec total_norm: 138.8714 (141.9345) loss: 146.1889 (147.5258) masked_loss: 1.6131 (1.5767) tag_loss: 144.6801 (145.9491) time: 1.4339 (1.7554) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4285 (1.7503) save_time: 8.8805 (21.7526) lr: 0.000055 max mem: 26307 2022-03-16 18:50:45,485.485 2829:checkpoint.py:222 save(): Saving checkpoint to output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0030000.pt 2022-03-16 18:50:54,698.698 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.65625 2022-03-16 18:50:54,698.698 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 159.35382080078125 2022-03-16 18:50:54,698.698 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.56384276076409 2022-03-16 18:51:10,417.417 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019727135077118874 2022-03-16 18:51:10,417.417 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 18:51:10,418.418 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'photo', '[MASK]', 'a', 'wooden', 'bench', 'underneath', 'a', 'tree', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 18:51:10,433.433 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'sky', 'bench', 'trunk', 'ground', 'park', 'road', 'grass', 'bus', 'leg', 'pole', 'car', 'sidewalk', 'person', 'street', 'building', '[UNK]', 'dirt', 'back', 'window', 'shadow', 'truck', 'woman', 'branch', 'light', 'man', 'tire', 'graffiti', 'next', 'lot', 'wall', 'wheel', 'sign', 'van', 'front', 'yellow', 'hat', 'couple', 'curb', 'large', 'side', 'dress', 'pavement', 'post', 'city', 'fence', 'line', 'top', 'flower', 'parking'] 2022-03-16 18:51:26,346.346 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'building', 'road', 'park', 'street', 'car', 'ground', 'post', 'tree', 'sky', 'bus', 'leg', 'truck', 'wooden', 'shadow', 'grass', 'photo', 'pole', 'bench', 'dirt', 'trunk'] 2022-03-16 18:53:49,144.144 2829:trainer.py:487 do_train_dict(): eta: 17:02:14 iter: 30100 speed: 278.8 images/sec total_norm: 142.2289 (143.9444) loss: 145.6458 (147.4455) masked_loss: 1.6090 (1.6459) tag_loss: 144.0788 (145.7996) time: 1.4321 (1.8366) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4270 (1.7431) save_time: 8.8421 (19.6009) lr: 0.000055 max mem: 26307 2022-03-16 18:53:49,509.509 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6111111044883728 2022-03-16 18:53:49,509.509 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 147.583740234375 2022-03-16 18:53:49,509.509 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.57026211315433 2022-03-16 18:54:05,157.157 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01978553831577301 2022-03-16 18:54:05,157.157 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 18:54:05,157.157 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'hum', '##mus', '[MASK]', 'pit', '##a', 'chips', 'on', 'a', 'plate', 'with', 'fa', '##la', '##fe', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 18:54:05,173.173 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['table', 'food', 'plate', 'egg', '[UNK]', 'meat', 'paper', 'sausage', 'bread', 'tomato', 'napkin', 'shadow', 'bowl', 'potato', 'sauce', 'cookie', 'cup', 'shirt', 'container', 'bag', 'floor', 'cheese', 'water', 'breakfast', 'vegetable', 'rice', 'butter', 'white', 'different', 'bottle', 'chicken', 'person', 'glass', 'phone', 'other', 'fork', 'beef', 'bun', 'mushroom', 'full', 'yellow', 'green', 'spoon', 'knife', 'dinner', 'spot', 'meal', 'side', 'chip', 'wall'] 2022-03-16 18:54:21,133.133 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'table', 'food', 'paper', 'plate', 'meat', 'bread', 'egg', 'sausage'] 2022-03-16 18:56:44,723.723 2829:trainer.py:487 do_train_dict(): eta: 16:59:33 iter: 30200 speed: 291.6 images/sec total_norm: 141.1012 (143.4288) loss: 146.5415 (147.2401) masked_loss: 1.5724 (1.5607) tag_loss: 144.9308 (145.6794) time: 1.4325 (1.7557) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4272 (1.7505) save_time: 8.8421 (19.6009) lr: 0.000055 max mem: 26307 2022-03-16 18:56:45,085.085 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5 2022-03-16 18:56:45,086.086 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 169.15689086914062 2022-03-16 18:56:45,086.086 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.56280526390957 2022-03-16 18:57:00,669.669 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019794346764683723 2022-03-16 18:57:00,669.669 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 18:57:00,670.670 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', '[MASK]', 'on', 'ski', '##s', 'stands', 'in', 'the', 'snow', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 18:57:00,685.685 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['jacket', '[UNK]', 'snow', 'ski', 'tree', 'pole', 'glove', 'ground', 'person', 'boot', 'sky', 'coat', 'head', 'helmet', 'skier', 'hand', 'hat', 'man', 'mountain', 'hill', 'woman', 'child', 'foot', 'boy', 'slope', 'snowy', 'girl', 'track', 'cloud', 'leg', 'arm', 'lift', 'face', 'building', 'background', 'backpack', 'branch', 'top', 'bush', 'poles', 'trunk', 'young', 'sign', 'skiing', 'hair', 'stripe', 'wire', 'hood', 'small', 'roof'] 2022-03-16 18:57:16,711.711 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'car', 'ground', 'person', 'arm', 'chair', 'tree', 'sign', 'sky', 'snow', 'lift', 'pole', 'jacket', 'ski', 'boot', 'helmet', 'glove', 'skier'] 2022-03-16 18:59:40,410.410 2829:trainer.py:487 do_train_dict(): eta: 16:56:53 iter: 30300 speed: 291.4 images/sec total_norm: 140.2482 (141.6203) loss: 145.6944 (147.3203) masked_loss: 1.6413 (1.6330) tag_loss: 144.1128 (145.6873) time: 1.4321 (1.7569) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4272 (1.7517) save_time: 8.8421 (19.6009) lr: 0.000054 max mem: 26307 2022-03-16 18:59:40,771.771 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6000000238418579 2022-03-16 18:59:40,771.771 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 139.28025817871094 2022-03-16 18:59:40,771.771 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.57024020897715 2022-03-16 18:59:56,489.489 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019815821200609207 2022-03-16 18:59:56,490.490 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 18:59:56,490.490 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'bunch', 'of', 'people', 'sits', 'valkyrie', 'a', 'table', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 18:59:56,505.505 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['table', 'wall', 'shirt', 'bottle', 'plate', 'man', 'person', 'woman', 'hair', 'pizza', 'picture', 'fork', 'head', '[UNK]', 'knife', 'glasses', 'group', 'watch', 'frame', 'girl', 'straw', 'face', 'mirror', 'water', 'restaurant', 'glass', 'food', 'hand', 'sunglasses', 'label', 'lid', 'napkin', 'camera', 'beer', 'can', 'cup', 'spoon', 'hat', 'chair', 'pitcher', 'top', 'cap', 'phone', 'juice', 'menu', 'box', 'light', 'salt', 'sign', 'necklace'] 2022-03-16 19:00:12,397.397 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'face', 'water', 'woman', 'hair', 'girl', 'person', 'table', 'wall', 'shirt', 'picture', 'drink', 'frame', 'plate', 'bottle', 'cap', 'glasses', 'bunch', 'fork', 'pizza', 'juice', 'lid', 'menu', 'sunglasses'] 2022-03-16 19:02:35,895.895 2829:trainer.py:487 do_train_dict(): eta: 16:54:11 iter: 30400 speed: 291.8 images/sec total_norm: 141.1617 (145.2466) loss: 148.5276 (149.5912) masked_loss: 1.5957 (1.5880) tag_loss: 146.8653 (148.0031) time: 1.4327 (1.7549) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4274 (1.7497) save_time: 8.8421 (19.6009) lr: 0.000054 max mem: 26307 2022-03-16 19:02:36,257.257 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5142857432365417 2022-03-16 19:02:36,257.257 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 134.93768310546875 2022-03-16 19:02:36,258.258 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
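lr drifts down from 0.000058 to 0.000054 over these ~2,200 iterations. With the base lr of 1e-4 from the run name, a plain linear decay to zero over an assumed ~65k total iterations lands within one rounding digit of the printed values, but not exactly on them, so the real schedule likely adds warmup or uses a slightly different horizon. A shape-only sketch, wired the way torch.optim.lr_scheduler.LambdaLR expects:

import torch

base_lr, max_iter = 1e-4, 65_000   # 1e-4 from the run name; max_iter assumed

def decay(it):
    return 1.0 - it / max_iter     # linear decay to zero (assumed shape)

p = torch.nn.Parameter(torch.zeros(1))
opt = torch.optim.SGD([p], lr=base_lr)
sched = torch.optim.lr_scheduler.LambdaLR(opt, decay)  # step once per iteration

for it in (28_200, 30_000, 30_400):
    print(it, f"{base_lr * decay(it):.6f}")
# 28200 0.000057 / 30000 0.000054 / 30400 0.000053, versus 0.000058 /
# 0.000055 / 0.000054 in the log: close, but treat this as a sketch only.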
= 70.5809973919978 2022-03-16 19:02:51,894.894 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01982649601995945 2022-03-16 19:02:51,895.895 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 19:02:51,895.895 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'red', 'yellow', '[MASK]', 'blue', 'airplane', 'is', 'sitting', 'on', 'the', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 19:02:51,911.911 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['window', 'nose', 'airplane', 'building', 'ground', 'wheel', 'cockpit', 'door', 'engine', 'wing', 'tree', '[UNK]', 'background', 'sky', 'stripe', 'tail', 'windshield', 'line', 'front', 'shadow', 'bridge', 'airport', 'runway', 'tire', 'sign', 'walkway', 'vehicle', 'pole', 'blue', 'fence', 'city', 'large', 'landing', 'truck', 'cover', 'plane', 'bush', 'man', 'logo', 'cone', 'grass', 'railing', 'flag', 'light', 'orange', 'display', 'fuselage', 'cart', 'box', 'white'] 2022-03-16 19:03:07,809.809 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['line', 'building', 'door', 'front', 'red', 'ground', 'blue', 'bridge', 'cover', 'engine', 'window', 'wing', 'tree', 'box', 'sky', 'yellow', 'background', 'nose', 'truck', 'wheel', 'tail', 'trailer', 'runway', 'cart', 'tire', 'airplane', 'cockpit', 'luggage', 'stripe', 'windshield'] 2022-03-16 19:05:31,627.627 2829:trainer.py:487 do_train_dict(): eta: 16:51:30 iter: 30500 speed: 291.4 images/sec total_norm: 140.5835 (143.5303) loss: 147.1093 (147.3752) masked_loss: 1.4450 (1.4994) tag_loss: 145.9678 (145.8757) time: 1.4332 (1.7573) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4278 (1.7521) save_time: 8.8421 (19.6009) lr: 0.000054 max mem: 26307 2022-03-16 19:05:31,988.988 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6176470518112183 2022-03-16 19:05:31,988.988 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 124.09210205078125 2022-03-16 19:05:31,989.989 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.58494514265871 2022-03-16 19:05:47,645.645 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019816165789961815 2022-03-16 19:05:47,645.645 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 19:05:47,646.646 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'half', 'of', 'a', 'sandwich', '[MASK]', 'held', 'in', 'hand', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 19:05:47,661.661 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sandwich', 'meat', 'bread', 'hand', 'thumb', 'person', 'paper', 'finger', '[UNK]', 'table', 'half', 'chicken', 'bun', 'food', 'crust', 'shadow', 'bottom', 'onion', 'dog', 'background', 'wall', 'cheese', 'napkin', 'man', 'piece', 'nail', 'eaten', 'palm', 'cup', 'close', 'top', 'hot', 'large', 'sub', 'arm', 'basket', 'line', 'hamburger', 'cut', 'white', 'roll', 'shirt', 'wrist', 'sleeve', 'pepper', 'handle', 'chair', 'ground', 'tomato', 'beef'] 2022-03-16 19:06:03,492.492 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['hand', 'half', 'person', 'food', 'paper', 'bottom', 'finger', 'shadow', 'meat', 'thumb', 'bread', 'sandwich'] 2022-03-16 19:08:27,302.302 2829:trainer.py:487 do_train_dict(): eta: 16:48:49 iter: 30600 speed: 291.4 images/sec total_norm: 139.9355 (143.5255) loss: 144.2353 (145.9365) masked_loss: 1.5490 (1.5329) tag_loss: 142.9350 (144.4036) time: 1.4322 (1.7568) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4270 (1.7516) save_time: 8.8421 (19.6009) lr: 0.000054 max mem: 26307 2022-03-16 19:08:27,663.663 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6363636255264282 2022-03-16 19:08:27,663.663 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 153.99993896484375 2022-03-16 19:08:27,663.663 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.5893248144889 2022-03-16 19:08:43,351.351 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019861236214637756 2022-03-16 19:08:43,351.351 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 19:08:43,352.352 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', '[MASK]', 'photographs', '[MASK]', 'comparing', 'the', 'city', 'streets', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 19:08:43,367.367 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'sky', 'cloud', 'pole', 'road', 'building', 'street', 'car', 'window', 'person', 'sign', 'light', 'sidewalk', '[UNK]', 'tire', 'line', 'roof', 'house', 'ground', 'bench', 'wall', 'picture', 'wheel', 'bus', 'clock', 'photo', 'truck', 'fence', 'woman', 'door', 'letter', 'bush', 'man', 'statue', 'church', 'tower', 'traffic', 'shirt', 'post', 'white', 'van', 'monument', 'hat', 'front', 'grass', 'flag', 'cross', 'mountain', 'snow', 'shadow'] 2022-03-16 19:08:59,277.277 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'back', 'city', 'house', 'church', 'building', 'road', 'street', 'light', 'car', 'person', 'wall', 'window', 'tree', 'tower', 'sky', 'truck', 'clock', 'mirror', 'cloud', 'statue', 'pole', 'tire'] 2022-03-16 19:11:23,330.330 2829:trainer.py:487 do_train_dict(): eta: 16:46:08 iter: 30700 speed: 290.9 images/sec total_norm: 140.1153 (141.1037) loss: 148.4659 (148.7408) masked_loss: 1.5450 (1.5845) tag_loss: 147.0765 (147.1564) time: 1.4331 (1.7603) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4279 (1.7551) save_time: 8.8421 (19.6009) lr: 0.000054 max mem: 26307 2022-03-16 19:11:23,691.691 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6285714507102966 2022-03-16 19:11:23,691.691 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 158.290283203125 2022-03-16 19:11:23,691.691 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
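`caption acc` is plausibly the fraction of masked caption tokens the model restores correctly in the current batch, which would explain its jumpiness (0.51, 0.62, 0.64 in consecutive reports over ~35 masked tokens). A hedged PyTorch sketch; the function name, tensor shapes, and the ignore_index convention are assumptions:

```python
import torch

def masked_caption_accuracy(logits, target, ignore_index=-100):
    """Fraction of masked positions whose argmax prediction matches the
    original token. ignore_index marks positions that were not masked."""
    pred = logits.argmax(dim=-1)
    valid = target.ne(ignore_index)          # only score masked positions
    correct = (pred.eq(target) & valid).sum()
    return correct.float() / valid.sum().clamp(min=1)

# Tiny check: 2 masked positions, one right and one wrong -> 0.5
logits = torch.zeros(1, 4, 10)
logits[0, 1, 3] = 1.0   # predicts token 3 at position 1 (correct)
logits[0, 3, 7] = 1.0   # predicts token 7 at position 3 (wrong, target 5)
target = torch.tensor([[-100, 3, -100, 5]])
print(masked_caption_accuracy(logits, target))  # tensor(0.5000)
```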
= 70.5810612393664 2022-03-16 19:11:39,484.484 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019849155098199844 2022-03-16 19:11:39,485.485 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 19:11:39,485.485 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'lean', 'back', 'style', 'motorcycle', 'with', 'saddle', '##bags', 'upstairs', 'outside', 'a', 'building', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 19:11:39,501.501 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['motorcycle', 'tire', 'building', 'bike', 'car', 'sky', 'wheel', 'light', 'window', '[UNK]', 'street', 'road', 'line', 'mirror', 'seat', 'tree', 'door', 'sign', 'pole', 'suv', 'handle', 'lot', 'wall', 'pipe', 'engine', 'parking', 'sidewalk', 'van', 'curb', 'ground', 'roof', 'fender', 'cloud', 'gas', 'windshield', 'fence', 'truck', 'tank', 'man', 'next', 'flag', 'license', 'rim', 'plate', 'person', 'shirt', 'helmet', 'shadow', 'house', 'parked'] 2022-03-16 19:11:55,374.374 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'building', 'road', 'street', 'light', 'car', 'style', 'seat', 'lot', 'van', 'chair', 'window', 'tree', 'sign', 'sky', 'roof', 'bag', 'handle', 'wheel', 'mirror', 'parking', 'bike', 'pipe', 'motorcycle', 'tire', 'exhaust', 'suv', 'fender'] 03-16 19:12:53.013 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 19:12:53.013 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 19:12:54.226 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 88}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-16 19:14:19,149.149 2829:trainer.py:487 do_train_dict(): eta: 16:43:27 iter: 30800 speed: 291.2 images/sec total_norm: 138.5450 (140.4120) loss: 146.8929 (145.0909) masked_loss: 1.5597 (1.5646) tag_loss: 145.1104 (143.5263) time: 1.4335 (1.7582) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4282 (1.7531) save_time: 8.8421 (19.6009) lr: 0.000054 max mem: 26307 2022-03-16 19:14:19,510.510 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.47058823704719543 2022-03-16 19:14:19,510.510 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 131.0284423828125 2022-03-16 19:14:19,510.510 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
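The interleaved `aml_server.py monitor()` entries, emitted roughly every 30 minutes, report one dict per GPU with `mem_used`, `mem_total`, and `gpu_util`. One way to produce exactly that structure is nvidia-smi's CSV query interface; this is a sketch, not necessarily how aml_server.py itself parses the output:

```python
import subprocess

def gpu_monitor():
    """Return one dict per GPU, matching the monitor() lines in this log.
    Uses nvidia-smi's CSV query mode; requires the NVIDIA driver."""
    out = subprocess.check_output([
        "nvidia-smi",
        "--query-gpu=memory.used,memory.total,utilization.gpu",
        "--format=csv,noheader,nounits",
    ]).decode()
    stats = []
    for line in out.strip().splitlines():
        used, total, util = (int(x) for x in line.split(", "))
        stats.append({"mem_used": used, "mem_total": total, "gpu_util": util})
    return stats

# On the 8x V100-32GB node in this log, the result looks like:
# [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, ...]
```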
= 70.60162789304665 2022-03-16 19:14:35,509.509 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01989004947245121 2022-03-16 19:14:35,510.510 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 19:14:35,510.510 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'nun', 'sharing', '[MASK]', 'with', 'two', 'young', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 19:14:35,525.525 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hair', 'shirt', 'wall', 'hand', 'head', 'man', 'picture', 'face', 'nose', 'ear', 'person', 'woman', 'table', 'window', 'railing', '[UNK]', 'glass', 'boy', 'eye', 'mouth', 'room', 'arm', 'finger', 'phone', 'frame', 'glasses', 'ceiling', 'cup', 'food', 'chair', 'collar', 'light', 'plate', 'watch', 'girl', 'cell', 'neck', 'bottle', 'necklace', 'jacket', 'wrist', 'laptop', 'bowl', 'fork', 'cake', 'dress', 'tie', 'bracelet', 'logo', 'ring'] 2022-03-16 19:14:51,484.484 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'young', 'light', 'woman', 'cup', 'television', 'hair', 'person', 'table', 'wall', 'boy', 'bar', 'sign', 'shirt', 'picture', 'nose', 'ear', 'handle', 'plate', 'knife', 'blade', 'pan', 'glasses', 'pizza', 'tray', 'slice', 'railing', 'nun', 'crust'] 2022-03-16 19:17:15,137.137 2829:trainer.py:487 do_train_dict(): eta: 16:40:46 iter: 30900 speed: 290.9 images/sec total_norm: 142.5558 (145.7625) loss: 150.8870 (148.2350) masked_loss: 1.5747 (1.6199) tag_loss: 149.3488 (146.6151) time: 1.4326 (1.7598) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4273 (1.7546) save_time: 8.8421 (19.6009) lr: 0.000053 max mem: 26307 2022-03-16 19:17:15,497.497 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7272727489471436 2022-03-16 19:17:15,498.498 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 157.89096069335938 2022-03-16 19:17:15,498.498 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.6092076086229 2022-03-16 19:17:31,368.368 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01991117373108864 2022-03-16 19:17:31,368.368 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 19:17:31,369.369 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'empty', 'kitchen', 'with', 'white', 'cabinets', '[MASK]', 'black', 'counters', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 19:17:31,384.384 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['cabinet', 'kitchen', 'stove', 'wall', '[UNK]', 'floor', 'oven', 'drawer', 'door', 'top', 'knob', 'handle', 'table', 'outlet', 'window', 'ceiling', 'pan', 'sink', 'tile', 'light', 'hood', 'wood', 'pot', 'refrigerator', 'black', 'board', 'counter', 'white', 'pipe', 'towel', 'doorway', 'lid', 'kettle', 'wooden', 'cutting', 'island', 'shelf', 'picture', 'rug', 'rack', 'plate', 'shadow', 'leg', 'large', 'clock', 'vent', 'old', 'paper', 'switch', 'room'] 2022-03-16 19:17:47,420.420 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'black', 'white', 'top', 'door', 'light', 'floor', 'table', 'wall', 'window', 'kitchen', 'handle', 'cabinet', 'ceiling', 'pan', 'beam', 'sink', 'cord', 'drawer', 'outlet', 'stove', 'knob', 'oven'] 2022-03-16 19:20:10,928.928 2829:trainer.py:487 do_train_dict(): eta: 16:38:05 iter: 31000 speed: 291.3 images/sec total_norm: 139.5078 (142.2444) loss: 150.1836 (150.2809) masked_loss: 1.6066 (1.6128) tag_loss: 148.4965 (148.6682) time: 1.4323 (1.7580) data: 0.0001 (0.0005) to_device: 0.0051 (0.0051) time_gpu: 1.4270 (1.7524) save_time: 8.8421 (19.6009) lr: 0.000053 max mem: 26307 2022-03-16 19:20:11,289.289 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6363636255264282 2022-03-16 19:20:11,289.289 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 144.26344299316406 2022-03-16 19:20:11,289.289 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
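`Tag Precision.` (its value lands at the start of the next log line, around 70.6 here and drifting upward by hundredths per report) reads like the percentage of generated tags that appear in the ground-truth tag set, presumably accumulated as a running average given how smoothly it moves. A hedged sketch of the per-sample quantity:

```python
def tag_precision(predicted_tags, gt_tags):
    """Precision of a generated tag list against ground-truth tags:
    |predicted ∩ GT| / |predicted|, in percent. Whether the logged value
    is per-batch or a running average is not visible in the log."""
    predicted, gt = set(predicted_tags), set(gt_tags)
    if not predicted:
        return 0.0
    return 100.0 * len(predicted & gt) / len(predicted)

pred = ['table', 'wall', 'shirt', 'bottle', 'plate']
gt = ['table', 'plate', 'fork', 'bottle']
print(tag_precision(pred, gt))  # 60.0
```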
= 70.60965793769076 2022-03-16 19:20:27,251.251 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01990804262459278 2022-03-16 19:20:27,252.252 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 19:20:27,252.252 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'banana', 'was', 'peeled', '[MASK]', 'had', 'a', '[MASK]', 'taken', 'out', 'of', 'it', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 19:20:27,268.268 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['banana', 'hand', 'thumb', 'person', 'peel', 'finger', 'background', 'nail', 'man', '[UNK]', 'shirt', 'stem', 'yellow', 'sleeve', 'wall', 'palm', 'ripe', 'window', 'flower', 'picture', 'ring', 'peeled', 'arm', 'bunch', 'logo', 'wrist', 'reflection', 'woman', 'face', 'handle', 'large', 'end', 'letter', 'orange', 'writing', 'half', 'paper', 'white', 'jacket', 'wire', 'object', 'spot', 'bananas', 'line', 'hole', 'close', 'top', 'hair', 'jean', 'label'] 2022-03-16 19:20:43,226.226 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['hand', 'end', 'person', 'finger', 'bite', 'thumb', 'stem', 'blanket', 'peel', 'banana'] 2022-03-16 19:23:07,019.019 2829:trainer.py:487 do_train_dict(): eta: 16:35:24 iter: 31100 speed: 290.8 images/sec total_norm: 140.4396 (141.3662) loss: 146.1037 (146.7042) masked_loss: 1.5495 (1.5990) tag_loss: 144.8077 (145.1052) time: 1.4335 (1.7609) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4282 (1.7556) save_time: 8.8421 (19.6009) lr: 0.000053 max mem: 26307 2022-03-16 19:23:07,380.380 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6666666865348816 2022-03-16 19:23:07,380.380 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 135.91114807128906 2022-03-16 19:23:07,380.380 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
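`Tag mAP` sits near 0.02 throughout, which is the order of magnitude mean average precision would take over a tag vocabulary of a few thousand entries with only a handful of positives per image. A sketch of sample-wise mAP under that assumption (the pipeline could equally compute AP per class; the log does not say):

```python
import numpy as np

def average_precision(scores, labels):
    """Non-interpolated AP for one sample: rank tags by score and average
    the precision at each rank where a true tag appears."""
    order = np.argsort(-scores)
    labels = labels[order]
    hits = np.cumsum(labels)
    ranks = np.arange(1, len(labels) + 1)
    precisions = hits / ranks
    return float((precisions * labels).sum() / max(labels.sum(), 1))

def tag_map(score_matrix, label_matrix):
    """Mean AP across samples."""
    return float(np.mean([average_precision(s, l)
                          for s, l in zip(score_matrix, label_matrix)]))

scores = np.random.rand(4, 1000)                  # 4 samples, 1000-tag vocab
labels = (np.random.rand(4, 1000) < 0.01).astype(float)
print(tag_map(scores, labels))  # small value, on the order of the ~0.01 positive rate
```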
= 70.61213259819226 2022-03-16 19:23:23,430.430 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019967107102274895 2022-03-16 19:23:23,430.430 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 19:23:23,431.431 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', '[MASK]', 'pie', 'and', 'a', 'fork', 'rest', 'on', 'a', 'yellow', 'plate', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 19:23:23,446.446 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['cake', 'plate', 'table', 'fork', 'handle', '[UNK]', 'piece', 'plastic', 'paper', 'spoon', 'bag', 'food', 'napkin', 'white', 'ice', 'dessert', 'light', 'reflection', 'object', 'knife', 'cream', 'layer', 'hole', 'top', 'bowl', 'shadow', 'cheese', 'line', 'blue', 'spot', 'eaten', 'floor', 'cup', 'close', 'slice', 'chocolate', 'next', 'bread', 'sauce', 'container', 'bottle', 'flower', 'half', 'pie', 'logo', 'hand', 'item', 'cloth', 'box', 'delicious'] 2022-03-16 19:23:39,383.383 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'light', 'rest', 'table', 'paper', 'yellow', 'handle', 'plate', 'bottle', 'fork', 'cake', 'pie', 'slice', 'spoon'] 2022-03-16 19:26:03,085.085 2829:trainer.py:487 do_train_dict(): eta: 16:32:42 iter: 31200 speed: 290.8 images/sec total_norm: 140.9551 (143.5243) loss: 146.7103 (146.4920) masked_loss: 1.5391 (1.5932) tag_loss: 144.9235 (144.8988) time: 1.4334 (1.7608) data: 0.0001 (0.0002) to_device: 0.0050 (0.0050) time_gpu: 1.4283 (1.7556) save_time: 8.8421 (19.6009) lr: 0.000053 max mem: 26307 2022-03-16 19:26:03,447.447 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5 2022-03-16 19:26:03,447.447 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 124.42505645751953 2022-03-16 19:26:03,447.447 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
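The `Input ids sample` lines show captions wrapped in `[CLS]`...`[SEP]` and padded with `[PAD]` to a fixed length of 70, with some tokens replaced by `[MASK]`. Stray words such as 'valkyrie' in 'bunch of people sits valkyrie a table' are consistent with BERT-style corruption, where a share of the selected positions receives a random vocabulary word instead of `[MASK]`. A sketch; the 15%/80%/10%/10% split is the standard BERT recipe, assumed rather than read from this pipeline's code:

```python
import random

def mask_tokens(tokens, vocab, mask_prob=0.15, max_len=70):
    """BERT-style input corruption plus fixed-length padding."""
    out = ['[CLS]'] + list(tokens) + ['[SEP]']
    for i in range(1, len(out) - 1):          # never corrupt [CLS]/[SEP]
        if random.random() < mask_prob:
            r = random.random()
            if r < 0.8:
                out[i] = '[MASK]'
            elif r < 0.9:
                out[i] = random.choice(vocab)  # random word, e.g. 'valkyrie'
            # else: keep the original token (still predicted as a target)
    out += ['[PAD]'] * (max_len - len(out))
    return out[:max_len]

vocab = ['valkyrie', 'table', 'dog', 'sky']   # toy vocabulary
print(mask_tokens('a bunch of people sits at a table'.split(), vocab))
```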
= 70.62576716852645 2022-03-16 19:26:19,448.448 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0199777539819479 2022-03-16 19:26:19,449.449 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 19:26:19,449.449 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'standing', 'with', 'a', '[MASK]', 'phone', 'strapped', 'to', 'his', 'ear', '.', 'hay', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 19:26:19,464.464 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hand', 'wall', 'shirt', 'man', 'lamp', 'door', 'face', 'hair', 'ceiling', 'nose', '[UNK]', 'eye', 'head', 'mouth', 'picture', 'cord', 'wire', 'shade', 'outlet', 'doorway', 'arm', 'ear', 'switch', 'glasses', 'light', 'book', 'frame', 'room', 'sleeve', 'box', 'beard', 'young', 'table', 'logo', 'jean', 'finger', 'cabinet', 'remote', 'phone', 'green', 'game', 'wii', 'tag', 'chin', 'floor', 'person', 'can', 'chair', 'couch', 'boy'] 2022-03-16 19:26:35,377.377 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'hand', 'face', 'band', 'book', 'door', 'hair', 'mouth', 'wall', 'arm', 'phone', 'eye', 'cell', 'shirt', 'teeth', 'nose', 'ear', 'chin', 'ceiling', 'switch', 'shade', 'sleeve', 'lamp', 'cord', 'outlet'] 2022-03-16 19:28:59,175.175 2829:trainer.py:487 do_train_dict(): eta: 16:30:01 iter: 31300 speed: 290.8 images/sec total_norm: 141.2020 (143.7188) loss: 145.8860 (147.5371) masked_loss: 1.6635 (1.6370) tag_loss: 144.0593 (145.9001) time: 1.4332 (1.7609) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4281 (1.7557) save_time: 8.8421 (19.6009) lr: 0.000053 max mem: 26307 2022-03-16 19:28:59,535.535 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6857143044471741 2022-03-16 19:28:59,536.536 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 172.20855712890625 2022-03-16 19:28:59,536.536 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.62044667286479 2022-03-16 19:29:15,639.639 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019985578954219818 2022-03-16 19:29:15,640.640 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 19:29:15,640.640 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'bright', '[MASK]', 'plane', 'contrasts', 'the', 'blue', 'sky', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 19:29:15,656.656 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'wing', 'tail', 'airplane', 'yellow', 'propeller', 'blue', 'engine', 'plane', '[UNK]', 'small', 'window', 'body', 'wheel', 'aircraft', 'clear', 'air', 'cockpit', 'stripe', 'nose', 'letter', 'white', 'number', 'blade', 'bottom', 'landing', 'large', 'high', 'day', 'gear', 'bright', 'helicopter', 'light', 'front', 'single', 'writing', 'red', 'logo', 'person', 'close', 'end', 'fin', 'side', 'gray', 'black', 'antenna', 'green', 'tree', 'star', 'old'] 2022-03-16 19:29:31,492.492 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['blue', 'wing', 'sky', 'yellow', 'bright', 'landing', 'plane', 'tail', 'gear', 'airplane', 'propeller'] 2022-03-16 19:31:55,377.377 2829:trainer.py:487 do_train_dict(): eta: 16:27:20 iter: 31400 speed: 290.6 images/sec total_norm: 142.8060 (146.0146) loss: 147.9489 (146.8488) masked_loss: 1.5580 (1.5676) tag_loss: 146.0753 (145.2812) time: 1.4327 (1.7620) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4276 (1.7569) save_time: 8.8421 (19.6009) lr: 0.000053 max mem: 26307 2022-03-16 19:31:55,738.738 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6000000238418579 2022-03-16 19:31:55,738.738 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 136.169677734375 2022-03-16 19:31:55,738.738 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
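Every `Sample Generation` line contains exactly 50 tags, so the decoder is presumably taking the top 50 vocabulary entries by tag score. A sketch; the sigmoid scoring and the `idx_to_tag` mapping are assumptions:

```python
import torch

def generate_tags(tag_logits, idx_to_tag, top_k=50):
    """Decode the tag head's scores into a ranked tag list; top_k=50 is
    assumed from the constant length of the logged generations."""
    scores = tag_logits.sigmoid()             # multi-label tag scores
    top = scores.topk(min(top_k, scores.numel()))
    return [idx_to_tag[i] for i in top.indices.tolist()]

idx_to_tag = {0: 'table', 1: 'wall', 2: 'shirt', 3: 'plate', 4: 'man'}
logits = torch.tensor([2.0, 1.5, 0.3, -0.2, 1.0])
print(generate_tags(logits, idx_to_tag, top_k=3))  # ['table', 'wall', 'man']
```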
= 70.6303590441507 2022-03-16 19:32:11,893.893 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019961439073085785 2022-03-16 19:32:11,893.893 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 19:32:11,893.893 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', 'boys', 'lying', 'down', 'with', 'his', 'two', 'stuffed', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 19:32:11,909.909 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['eye', 'hair', 'nose', 'ear', 'face', 'mouth', 'bear', 'head', 'teddy', 'floor', 'foot', 'girl', 'child', 'arm', 'carpet', 'sweater', 'stuffed', 'animal', 'hand', 'leg', 'teeth', 'paw', 'ground', 'boy', 'little', 'baby', 'toy', '[UNK]', 'shirt', 'forehead', 'eyebrow', 'blanket', 'young', 'finger', 'towel', 'small', 'wall', 'toe', 'bow', 'rug', 'ball', 'shoe', 'sock', 'tail', 'next', 'kid', 'tie', 'bang', 'muzzle', 'photo'] 2022-03-16 19:32:27,938.938 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'hand', 'face', 'hair', 'mouth', 'floor', 'child', 'arm', 'boy', 'eye', 'foot', 'baby', 'shirt', 'teeth', 'animal', 'nose', 'ear', 'bear', 'eyebrow', 'carpet', 'teddy', 'stuffed', 'sweater', 'paw'] 2022-03-16 19:34:51,242.242 2829:trainer.py:487 do_train_dict(): eta: 16:24:38 iter: 31500 speed: 291.1 images/sec total_norm: 140.5874 (143.2908) loss: 145.6959 (144.5144) masked_loss: 1.5349 (1.5340) tag_loss: 144.1470 (142.9804) time: 1.4319 (1.7587) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4269 (1.7535) save_time: 8.8421 (19.6009) lr: 0.000053 max mem: 26307 2022-03-16 19:34:51,605.605 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6216216087341309 2022-03-16 19:34:51,605.605 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 134.2286834716797 2022-03-16 19:34:51,605.605 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.63051314293584 2022-03-16 19:35:07,903.903 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019935280084609985 2022-03-16 19:35:07,903.903 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 19:35:07,904.904 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'snow', '##board', '##er', 'sits', 'in', 'snow', 'as', 'another', 'charter', 'along', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 19:35:07,920.920 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'snow', 'jacket', 'man', 'ground', 'head', 'glove', 'boot', 'coat', 'hat', 'tree', 'leg', 'person', 'hand', 'arm', 'board', 'foot', 'tag', 'sky', 'face', 'patch', 'pole', 'hood', 'helmet', 'mountain', 'shoe', 'cap', 'fence', 'mouth', 'nose', 'ski', 'slope', 'woman', 'scarf', 'track', 'strap', 'hair', 'snowy', 'shadow', 'logo', 'cloud', 'hill', 'line', 'eye', 'flag', 'glasses', 'building', 'background', 'child', 'skier'] 2022-03-16 19:35:23,928.928 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'ground', 'person', 'foot', 'tree', 'sky', 'leg', 'snow', 'coat', 'pole', 'jacket', 'boot', 'shoe', 'backpack', 'strap', 'buckle'] 2022-03-16 19:37:47,363.363 2829:trainer.py:487 do_train_dict(): eta: 16:21:56 iter: 31600 speed: 290.7 images/sec total_norm: 139.4256 (140.1671) loss: 146.7512 (147.9309) masked_loss: 1.4832 (1.5031) tag_loss: 145.2636 (146.4278) time: 1.4319 (1.7612) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4268 (1.7561) save_time: 8.8421 (19.6009) lr: 0.000052 max mem: 26307 2022-03-16 19:37:47,725.725 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361 2022-03-16 19:37:47,725.725 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 171.8945770263672 2022-03-16 19:37:47,725.725 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
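The printed `lr` drifts from 0.000054 down to 0.000050 across these ~2,500 iterations, losing about 1e-6 every 600 iterations, which is consistent with a linear decay printed at six decimal places. The sketch below matches every lr value in this excerpt under assumed parameters (base_lr 1e-4, max_iter 65075, 1,000 warmup iterations); the real schedule could differ:

```python
def linear_decay_lr(iter_num, base_lr, max_iter, warmup_iters=0):
    """Linear warmup followed by linear decay to zero at max_iter."""
    if iter_num < warmup_iters:
        return base_lr * iter_num / max(warmup_iters, 1)
    progress = (iter_num - warmup_iters) / max(max_iter - warmup_iters, 1)
    return base_lr * (1.0 - progress)

for it in (30400, 31600, 32900):
    print(it, f"{linear_decay_lr(it, 1e-4, 65075, warmup_iters=1000):.6f}")
# 30400 0.000054, 31600 0.000052, 32900 0.000050, matching the log
```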
= 70.62982951504199 2022-03-16 19:38:04,079.079 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019937163218855858 2022-03-16 19:38:04,079.079 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 19:38:04,080.080 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'table', 'with', 'plates', 'and', 'containers', 'of', 'food', '[MASK]', 'electronics', '[MASK]', 'it', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 19:38:04,095.095 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['plate', 'corn', 'tomato', 'table', 'food', '[UNK]', 'bowl', 'meat', 'glass', 'onion', 'handle', 'phone', 'knife', 'spoon', 'person', 'container', 'vegetable', 'bag', 'wall', 'fork', 'napkin', 'potato', 'pen', 'bottle', 'cup', 'cell', 'box', 'hand', 'tray', 'cheese', 'sausage', 'towel', 'keyboard', 'shirt', 'carrot', 'mushroom', 'pan', 'lid', 'paper', 'chair', 'pepper', 'book', 'cabinet', 'stove', 'bread', 'banana', 'foil', 'butter', 'man', 'label'] 2022-03-16 19:38:20,061.061 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'hand', 'book', 'cup', 'design', 'person', 'table', 'food', 'phone', 'glass', 'box', 'cell', 'ring', 'finger', 'bag', 'bowl', 'plate', 'knife', 'pen', 'glasses', 'cheese', 'keyboard', 'fork', 'corn', 'cord', 'lid', 'butter', 'laptop', 'tomato'] 2022-03-16 19:40:43,802.802 2829:trainer.py:487 do_train_dict(): eta: 16:19:15 iter: 31700 speed: 290.2 images/sec total_norm: 142.4589 (143.7450) loss: 144.6931 (146.5456) masked_loss: 1.5818 (1.5985) tag_loss: 142.8194 (144.9471) time: 1.4335 (1.7643) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4282 (1.7591) save_time: 8.8421 (19.6009) lr: 0.000052 max mem: 26307 2022-03-16 19:40:44,163.163 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7941176295280457 2022-03-16 19:40:44,163.163 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 138.7493438720703 2022-03-16 19:40:44,164.164 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.63806295095 2022-03-16 19:41:00,448.448 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019949138164520264 2022-03-16 19:41:00,448.448 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 19:41:00,449.449 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'kids', '[MASK]', 'to', 'surf', 'in', 'the', 'ocean', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 19:41:00,464.464 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hair', 'water', 'arm', '[UNK]', 'hand', 'head', 'wave', 'shirt', 'logo', 'face', 'girl', 'boy', 'ocean', 'board', 'surfer', 'man', 'child', 'leg', 'suit', 'design', 'foot', 'sleeve', 'top', 'young', 'short', 'ear', 'person', 'wet', 'small', 'watch', 'surf', 'woman', 'star', 'glasses', 'name', 'reflection', 'trunk', 'mouth', 'nose', 'back', 'kid', 'beach', 'strap', 'body', 'large', 'boogie', 'bracelet', 'wrist', 'little', 'big'] 2022-03-16 19:41:16,439.439 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'water', 'board', 'hair', 'girl', 'design', 'arm', 'boy', 'shirt', 'ocean', 'leg', 'wave', 'logo', 'reflection'] 03-16 19:42:54.325 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 19:42:54.325 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 19:42:55.514 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-16 19:43:40,107.107 2829:trainer.py:487 do_train_dict(): eta: 16:16:33 iter: 31800 speed: 290.4 images/sec total_norm: 139.8025 (143.5358) loss: 142.4393 (143.7542) masked_loss: 1.5756 (1.6109) tag_loss: 140.8163 (142.1433) time: 1.4323 (1.7631) data: 0.0001 (0.0001) to_device: 0.0052 (0.0051) time_gpu: 1.4271 (1.7579) save_time: 8.8421 (19.6009) lr: 0.000052 max mem: 26307 2022-03-16 19:43:40,468.468 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.625 2022-03-16 19:43:40,468.468 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 132.703125 2022-03-16 19:43:40,468.468 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.65341087269559 2022-03-16 19:43:56,732.732 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019941166043281555 2022-03-16 19:43:56,733.733 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 19:43:56,733.733 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', '##walk', 'signal', 'at', 'an', '[MASK]', 'with', 'a', 'car', 'and', 'a', 'bus', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 19:43:56,748.748 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['building', 'window', 'light', 'pole', 'sign', 'traffic', 'street', 'person', '[UNK]', 'man', 'wall', 'letter', 'woman', 'hand', 'shirt', 'door', 'store', 'arrow', 'balcony', 'hair', 'reflection', 'railing', 'signal', 'car', 'blind', 'bag', 'jean', 'tree', 'word', 'logo', 'sky', 'banner', 'front', 'wheel', 'stop', 'jacket', 'sidewalk', 'bike', 'post', 'box', 'coat', 'back', 'camera', 'flag', 'tire', 'shadow', 'bus', 'advertisement', 'head', 'purse'] 2022-03-16 19:44:12,590.590 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'building', 'street', 'light', 'woman', 'car', 'hair', 'person', 'window', 'store', 'sign', 'jean', 'shirt', 'bus', 'traffic', 'bag', 'signal', 'plate', 'wheel', 'coat', 'license', 'pole', 'jacket', 'intersection', 'bike', 'arrow', 'purse', 'reflection', 'bicycle', 'sidewalk', 'tire', 'advertisement', 'windshield'] 2022-03-16 19:46:36,316.316 2829:trainer.py:487 do_train_dict(): eta: 16:13:52 iter: 31900 speed: 290.6 images/sec total_norm: 140.7355 (143.6749) loss: 145.0065 (148.3272) masked_loss: 1.5698 (1.5752) tag_loss: 143.1826 (146.7520) time: 1.4311 (1.7621) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4263 (1.7570) save_time: 8.8421 (19.6009) lr: 0.000052 max mem: 26307 2022-03-16 19:46:36,676.676 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6060606241226196 2022-03-16 19:46:36,676.676 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 139.28988647460938 2022-03-16 19:46:36,676.676 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
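`max mem: 26307` stays constant for the entire excerpt, which fits a peak-allocation counter rather than a live reading: once the largest activation footprint has been touched, the number never moves. In PyTorch this is typically `torch.cuda.max_memory_allocated()`, reported here in MB (a sketch; the trainer's exact rounding is not visible):

```python
import torch

def max_mem_mb():
    """Peak bytes ever held by the CUDA caching allocator, in MB."""
    return torch.cuda.max_memory_allocated() // (1024 * 1024)

if torch.cuda.is_available():
    x = torch.empty(1024, 1024, device="cuda")  # allocate ~4 MB
    print(f"max mem: {max_mem_mb()}")
```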
= 70.65457969903946 2022-03-16 19:46:53,204.204 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01998344622552395 2022-03-16 19:46:53,204.204 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 19:46:53,205.205 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'on', 'a', 'bike', 'with', '[MASK]', 'small', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 19:46:53,220.220 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'road', 'grass', 'bush', 'dog', 'head', '[UNK]', 'sidewalk', 'curb', 'leg', 'building', 'trash', 'ear', 'street', 'can', 'pole', 'house', 'roof', 'tire', 'window', 'sky', 'box', 'sign', 'bin', 'table', 'door', 'bench', 'trunk', 'line', 'car', 'wheel', 'light', 'spot', 'ground', 'tail', 'wall', 'flower', 'logo', 'face', 'lid', 'mirror', 'truck', 'shirt', 'man', 'plate', 'leaf', 'windshield', 'cow', 'post', 'person'] 2022-03-16 19:47:09,294.294 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'can', 'head', 'man', 'house', 'hand', 'face', 'small', 'air', 'building', 'road', 'street', 'short', 'rock', 'foot', 'window', 'step', 'tree', 'letter', 'shirt', 'dog', 'leg', 'ear', 'wheel', 'grass', 'bush', 'hat', 'bike', 'logo', 'trunk', 'fence', 'bicycle', 'shoe', 'cart', 'flip', 'sidewalk', 'tire', 'garbage', 'fender', 'stair', 'lettering', 'flop'] 2022-03-16 19:49:32,579.579 2829:trainer.py:487 do_train_dict(): eta: 16:11:10 iter: 32000 speed: 290.5 images/sec total_norm: 139.9445 (142.2715) loss: 144.7415 (145.7677) masked_loss: 1.4750 (1.5651) tag_loss: 143.1267 (144.2026) time: 1.4316 (1.7627) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4264 (1.7574) save_time: 8.8421 (19.6009) lr: 0.000052 max mem: 26307 2022-03-16 19:49:32,939.939 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5945945978164673 2022-03-16 19:49:32,939.939 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 164.02615356445312 2022-03-16 19:49:32,939.939 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.65701913090881 2022-03-16 19:49:49,318.318 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02000613324344158 2022-03-16 19:49:49,319.319 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 19:49:49,319.319 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'person', 'rides', '[MASK]', 'bicycle', ',', 'while', 'a', 'woman', 'holding', 'a', 'blend', '##er', 'on', '[MASK]', 'table', 'gasps', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 19:49:49,334.334 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['bike', 'shirt', 'bicycle', 'man', '[UNK]', 'woman', 'tire', 'sign', 'hat', 'ground', 'sidewalk', 'wheel', 'cap', 'table', 'shoe', 'top', 'head', 'person', 'hand', 'bottle', 'jean', 'tank', 'hair', 'building', 'short', 'arm', 'wall', 'paper', 'shadow', 'base', 'cup', 'window', 'sunglasses', 'seat', 'bag', 'leg', 'handle', 'tattoo', 'dress', 'skirt', 'jug', 'pedal', 'belt', 'pitcher', 'helmet', 'glasses', 'watch', 'liquid', 'pot', 'container'] 2022-03-16 19:50:05,241.241 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'top', 'woman', 'ground', 'person', 'table', 'wall', 'food', 'arm', 'window', 'sign', 'jean', 'shirt', 'dress', 'bag', 'tank', 'shadow', 'wheel', 'belt', 'hat', 'cap', 'liquid', 'bike', 'logo', 'bicycle', 'shoe', 'tattoo', 'sidewalk', 'tire', 'pedal', 'stripe', 'gasps'] 2022-03-16 19:52:28,911.911 2829:trainer.py:487 do_train_dict(): eta: 16:08:28 iter: 32100 speed: 290.4 images/sec total_norm: 139.4007 (143.9840) loss: 145.0429 (147.1746) masked_loss: 1.5359 (1.5990) tag_loss: 143.7283 (145.5756) time: 1.4315 (1.7633) data: 0.0001 (0.0005) to_device: 0.0052 (0.0051) time_gpu: 1.4263 (1.7577) save_time: 8.8421 (19.6009) lr: 0.000052 max mem: 26307 2022-03-16 19:52:29,273.273 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.38235294818878174 2022-03-16 19:52:29,274.274 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 141.93539428710938 2022-03-16 19:52:29,274.274 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
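The per-iteration `time` is broken into `data` (waiting on the loader), `to_device` (host-to-device copies), and `time_gpu` (compute); in this log the loader wait is ~0.0001 s, i.e. prefetching keeps all eight GPUs fed. A sketch of one way to instrument a step like that (field names mirror the log; the actual hooks in trainer.py are not shown):

```python
import time
import torch

def timed_step(loader_iter, model, device):
    """Sketch of the log's timing split: 'data' is loader wait,
    'to_device' the H2D copy, 'time_gpu' the synchronized compute."""
    t0 = time.time()
    batch = next(loader_iter)                 # 'data'
    t1 = time.time()
    batch = {k: v.to(device, non_blocking=True) for k, v in batch.items()}
    t2 = time.time()
    loss = model(batch)                       # forward/backward would go here
    if device.type == "cuda":
        torch.cuda.synchronize()              # else GPU work measures as ~0
    t3 = time.time()
    return loss, {"data": t1 - t0, "to_device": t2 - t1,
                  "time_gpu": t3 - t2, "time": t3 - t0}
```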
= 70.66980986269364 2022-03-16 19:52:45,860.860 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02003314532339573 2022-03-16 19:52:45,860.860 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 19:52:45,861.861 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'family', 'standing', 'around', '[MASK]', '[MASK]', 'go', 'down', 'the', 'ski', 'slope', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 19:52:45,876.876 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['person', 'snow', '[UNK]', 'helmet', 'ski', 'tree', 'jacket', 'ground', 'sky', 'glove', 'man', 'mountain', 'skier', 'tent', 'boot', 'group', 'pole', 'head', 'shirt', 'coat', 'hand', 'logo', 'background', 'hat', 'letter', 'arm', 'sign', 'strap', 'backpack', 'child', 'woman', 'kid', 'slope', 'flag', 'tag', 'cloud', 'canopy', 'number', 'girl', 'banner', 'hill', 'suit', 'hood', 'boy', 'vest', 'leg', 'stripe', 'top', 'track', 'sunglasses'] 2022-03-16 19:53:01,807.807 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'ground', 'person', 'child', 'boy', 'mountain', 'tree', 'letter', 'sky', 'background', 'snow', 'kid', 'coat', 'net', 'hat', 'pole', 'jacket', 'ski', 'boot', 'slope', 'helmet', 'shoe', 'glove', 'skier', 'sock'] 2022-03-16 19:55:25,356.356 2829:trainer.py:487 do_train_dict(): eta: 16:05:46 iter: 32200 speed: 290.2 images/sec total_norm: 143.2684 (145.3279) loss: 143.8137 (144.5256) masked_loss: 1.5552 (1.5678) tag_loss: 142.0320 (142.9578) time: 1.4312 (1.7644) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4260 (1.7592) save_time: 8.8421 (19.6009) lr: 0.000052 max mem: 26307 2022-03-16 19:55:25,719.719 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6666666865348816 2022-03-16 19:55:25,719.719 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 117.05340576171875 2022-03-16 19:55:25,719.719 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.67699672120274 2022-03-16 19:55:42,189.189 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020033935084939003 2022-03-16 19:55:42,189.189 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 19:55:42,190.190 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'dog', 'running', '[MASK]', 'sand', 'with', 'a', '[MASK]', '##is', '##bee', 'in', 'its', 'mouth', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 19:55:42,205.205 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['dog', 'sand', 'shadow', 'nose', 'water', '[UNK]', 'eye', 'beach', 'leg', 'head', 'ear', 'tail', 'arrow', 'mouth', 'sky', 'hair', 'paw', 'ocean', 'face', 'blue', 'footprint', 'foot', 'brown', 'design', 'small', 'cloud', 'white', 'ring', 'next', 'star', 'ground', 'top', 'wave', 'sandy', 'leaf', 'writing', 'black', 'arm', 'man', 'rock', 'picture', 'mountain', 'close', 'large', 'disc', 'cute', 'couple', 'short', 'hand', 'green'] 2022-03-16 19:55:58,122.122 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'face', 'water', 'mouth', 'eye', 'beach', 'sky', 'dog', 'leg', 'nose', 'ear', 'shadow', 'sand', 'arrow', 'footprint'] 2022-03-16 19:58:21,748.748 2829:trainer.py:487 do_train_dict(): eta: 16:03:04 iter: 32300 speed: 290.3 images/sec total_norm: 141.1412 (144.2714) loss: 146.8610 (146.6246) masked_loss: 1.4923 (1.5456) tag_loss: 145.3705 (145.0791) time: 1.4325 (1.7639) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4274 (1.7588) save_time: 8.8421 (19.6009) lr: 0.000051 max mem: 26307 2022-03-16 19:58:22,110.110 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.59375 2022-03-16 19:58:22,110.110 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 129.6722412109375 2022-03-16 19:58:22,110.110 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.6906299473327 2022-03-16 19:58:38,797.797 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020054377615451813 2022-03-16 19:58:38,798.798 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 19:58:38,798.798 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'green', 'train', 'traveling', 'down', 'rail', 'road', 'tracks', '[MASK]', 'to', 'a', 'forest', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 19:58:38,814.814 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'sky', 'track', 'train', 'window', 'grass', 'pole', 'road', 'cloud', 'front', 'wire', 'fence', 'door', 'ground', 'car', 'path', 'building', 'light', '[UNK]', 'gravel', 'green', 'sign', 'power', 'person', 'roof', 'logo', 'passenger', 'wheel', 'line', 'windshield', 'railroad', 'post', 'stripe', 'tower', 'telephone', 'bush', 'next', 'yellow', 'bumper', 'house', 'engine', 'long', 'flower', 'blue', 'top', 'trunk', 'wall', 'street', 'platform', 'hill'] 2022-03-16 19:58:54,764.764 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['house', 'building', 'top', 'road', 'front', 'light', 'ground', 'track', 'green', 'forest', 'window', 'train', 'tree', 'sky', 'path', 'rail', 'grass', 'bush', 'cloud', 'pole', 'flower'] 2022-03-16 20:01:18,269.269 2829:trainer.py:487 do_train_dict(): eta: 16:00:22 iter: 32400 speed: 290.1 images/sec total_norm: 142.1349 (145.6806) loss: 145.1860 (146.0428) masked_loss: 1.5131 (1.5705) tag_loss: 143.5649 (144.4722) time: 1.4324 (1.7652) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4271 (1.7601) save_time: 8.8421 (19.6009) lr: 0.000051 max mem: 26307 2022-03-16 20:01:18,629.629 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663 2022-03-16 20:01:18,629.629 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 156.820068359375 2022-03-16 20:01:18,629.629 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.69842272244966 2022-03-16 20:01:35,402.402 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020048072561621666 2022-03-16 20:01:35,403.403 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 20:01:35,403.403 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'man', 'kite', '##boarding', 'over', 'the', 'ocean', '[MASK]', 'to', 'shore', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 20:01:35,419.419 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'water', 'cloud', 'building', '[UNK]', 'wave', 'person', 'tree', 'man', 'hill', 'shore', 'house', 'rock', 'beach', 'head', 'arm', 'mountain', 'short', 'hair', 'ocean', 'shirt', 'leg', 'sand', 'board', 'kite', 'boat', 'background', 'suit', 'boy', 'dog', 'hand', 'body', 'woman', 'hat', 'foot', 'top', 'jacket', 'bird', 'island', 'umbrella', 'city', 'tower', 'grass', 'wall', 'surfer', 'tail', 'bush', 'rope', 'shoe', 'ear'] 2022-03-16 20:01:51,299.299 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'city', 'man', 'house', 'next', 'water', 'building', 'person', 'tree', 'sky', 'ocean', 'wave', 'shore', 'cloud', 'reflection', 'kite'] 2022-03-16 20:04:14,808.808 2829:trainer.py:487 do_train_dict(): eta: 15:57:40 iter: 32500 speed: 290.0 images/sec total_norm: 140.3250 (145.0723) loss: 146.5620 (146.6340) masked_loss: 1.5365 (1.5268) tag_loss: 144.7990 (145.1073) time: 1.4317 (1.7654) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4265 (1.7602) save_time: 8.8421 (19.6009) lr: 0.000051 max mem: 26307 2022-03-16 20:04:15,169.169 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361 2022-03-16 20:04:15,169.169 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 141.58038330078125 2022-03-16 20:04:15,169.169 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.70738759655163 2022-03-16 20:04:31,791.791 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02004699409008026 2022-03-16 20:04:31,792.792 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 20:04:31,792.792 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'bearded', ',', '[MASK]', 'man', 'wearing', 'a', 'neck', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 20:04:31,807.807 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['eye', 'nose', 'ear', 'beard', 'hair', 'wall', 'man', 'face', 'hand', 'tie', 'mouth', 'neck', '[UNK]', 'shirt', 'collar', 'finger', 'arm', 'lip', 'mustache', 'shadow', 'button', 'logo', 'facial', 'knot', 'camera', 'chin', 'wrist', 'eyebrow', 'head', 'shoulder', 'ring', 'tattoo', 'stripe', 'sleeve', 'vest', 'strap', 'forehead', 'bow', 'design', 'dress', 'close', 'front', 'bearded', 'young', 'red', 'white', 'pocket', 'black', 'suit', 'thumb'] 2022-03-16 20:04:47,722.722 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'hand', 'face', 'hair', 'mouth', 'star', 'wall', 'arm', 'eye', 'neck', 'ring', 'finger', 'nose', 'ear', 'lip', 'tie', 'naked', 'flower', 'beard', 'knot'] 2022-03-16 20:07:11,589.589 2829:trainer.py:487 do_train_dict(): eta: 15:54:59 iter: 32600 speed: 289.6 images/sec total_norm: 145.5555 (148.4005) loss: 147.1146 (147.5797) masked_loss: 1.5570 (1.5840) tag_loss: 145.3728 (145.9957) time: 1.4322 (1.7678) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4271 (1.7626) save_time: 8.8421 (19.6009) lr: 0.000051 max mem: 26307 2022-03-16 20:07:11,950.950 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6969696879386902 2022-03-16 20:07:11,950.950 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 119.18550109863281 2022-03-16 20:07:11,950.950 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.71826982643991 2022-03-16 20:07:28,596.596 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020040763542056084 2022-03-16 20:07:28,597.597 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 20:07:28,597.597 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'black', 'cat', 'making', 'an', 'angry', 'face', 'while', '[MASK]', 'on', 'the', '[MASK]', 'floor', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 20:07:28,612.612 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['cat', 'ear', 'head', 'eye', 'wall', 'nose', 'tile', '[UNK]', 'face', 'black', 'paw', 'floor', 'leg', 'tail', 'rug', 'bathroom', 'carpet', 'mat', 'sink', 'foot', 'bag', 'door', 'towel', 'handle', 'lid', 'white', 'top', 'ground', 'table', 'animal', 'cloth', 'bed', 'cord', 'body', 'tub', 'tag', 'mouth', 'paper', 'bottle', 'box', 'next', 'knob', 'collar', 'container', 'pillow', 'toy', 'bowl', 'curtain', 'shelf', 'book'] 2022-03-16 20:07:44,576.576 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'face', 'black', 'ground', 'floor', 'wall', 'eye', 'shirt', 'leg', 'clothes', 'nose', 'ear', 'angry', 'cat', 'bathroom', 'tail', 'clothing', 'towel', 'tile', 'rug', 'paw'] 2022-03-16 20:10:08,098.098 2829:trainer.py:487 do_train_dict(): eta: 15:52:16 iter: 32700 speed: 290.1 images/sec total_norm: 143.1995 (146.2758) loss: 144.7127 (146.0520) masked_loss: 1.5765 (1.6242) tag_loss: 142.6012 (144.4278) time: 1.4312 (1.7651) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4259 (1.7599) save_time: 8.8421 (19.6009) lr: 0.000051 max mem: 26307 2022-03-16 20:10:08,459.459 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6060606241226196 2022-03-16 20:10:08,459.459 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 155.48599243164062 2022-03-16 20:10:08,459.459 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.71352985428601 2022-03-16 20:10:25,435.435 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02006969042122364 2022-03-16 20:10:25,435.435 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 20:10:25,435.435 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'round', '[MASK]', 'disc', 'lying', 'unidentified', 'rippled', 'water', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 20:10:25,451.451 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['water', '[UNK]', 'reflection', 'tail', 'head', 'light', 'ripple', 'object', 'wing', 'leg', 'bird', 'leaf', 'river', 'neck', 'feather', 'beak', 'shadow', 'body', 'red', 'foot', 'duck', 'ground', 'ball', 'black', 'branch', 'wave', 'back', 'plant', 'pole', 'grass', 'small', 'top', 'hand', 'face', 'handle', 'base', 'stripe', 'arm', 'post', 'white', 'dog', 'dock', 'rope', 'boat', 'yellow', 'line', 'paw', 'lake', 'wet', 'next'] 2022-03-16 20:10:41,367.367 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'water', 'red', 'round', 'disc', 'rippled'] 03-16 20:12:55.613 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 20:12:55.613 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 20:12:56.953 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-16 20:13:05,050.050 2829:trainer.py:487 do_train_dict(): eta: 15:49:35 iter: 32800 speed: 289.3 images/sec total_norm: 142.1701 (144.3814) loss: 145.3614 (147.2254) masked_loss: 1.4899 (1.5801) tag_loss: 143.8517 (145.6453) time: 1.4326 (1.7695) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4274 (1.7643) save_time: 8.8421 (19.6009) lr: 0.000051 max mem: 26307 2022-03-16 20:13:05,411.411 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5142857432365417 2022-03-16 20:13:05,411.411 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 170.00006103515625 2022-03-16 20:13:05,411.411 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.70257175294824 2022-03-16 20:13:22,266.266 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02009853534400463 2022-03-16 20:13:22,266.266 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 20:13:22,267.267 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'woman', 'sitting', '[MASK]', '[MASK]', 'park', 'bench', 'next', 'to', 'a', 'reflecting', 'pool', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 20:13:22,282.282 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'bench', 'flower', 'woman', 'hair', 'shoe', 'purse', 'tree', 'leg', 'bag', 'glasses', 'person', 'short', '[UNK]', 'water', 'bottle', 'bush', 'hand', 'skirt', 'lady', 'grass', 'head', 'plant', 'man', 'park', 'sidewalk', 'ground', 'fern', 'step', 'platform', 'front', 'couple', 'watch', 'arm', 'seat', 'dress', 'blouse', 'girl', 'sunglasses', 'wooden', 'slab', 'next', 'trunk', 'face', 'garden', 'top', 'floor', 'can', 'leaf', 'phone'] 2022-03-16 20:13:38,339.339 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'water', 'park', 'woman', 'short', 'hair', 'lady', 'plant', 'tree', 'shirt', 'leg', 'bag', 'pool', 'grass', 'bush', 'bottle', 'flower', 'bench', 'glasses', 'purse', 'skirt', 'shoe'] 2022-03-16 20:16:01,951.951 2829:trainer.py:487 do_train_dict(): eta: 15:46:53 iter: 32900 speed: 289.4 images/sec total_norm: 141.7698 (143.5504) loss: 141.2737 (143.8509) masked_loss: 1.5296 (1.5837) tag_loss: 139.8023 (142.2672) time: 1.4319 (1.7691) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4266 (1.7639) save_time: 8.8421 (19.6009) lr: 0.000050 max mem: 26307 2022-03-16 20:16:02,312.312 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.529411792755127 2022-03-16 20:16:02,313.313 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 134.08148193359375 2022-03-16 20:16:02,313.313 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.70526480241256 2022-03-16 20:16:19,071.071 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020088929682970047 2022-03-16 20:16:19,071.071 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 20:16:19,072.072 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'bunch', 'of', '[MASK]', 'are', 'sitting', '[MASK]', 'top', 'of', 'orange', '##s', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 20:16:19,087.087 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['banana', 'fruit', 'spot', 'stem', 'orange', 'line', 'table', 'end', 'ripe', '[UNK]', 'top', 'bowl', 'leaf', 'apple', 'dot', 'bananas', 'background', 'bunch', 'plate', 'close', 'wall', 'skin', 'bottom', 'hole', 'citrus', 'face', 'light', 'peel', 'basket', 'other', 'peeled', 'next', 'design', 'piece', 'green', 'shadow', 'pile', 'writing', 'lemon', 'large', 'yellow', 'picture', 'reflection', 'nose', 'rim', 'eye', 'object', 'full', 'blue', 'different'] 2022-03-16 20:16:34,987.987 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'end', 'line', 'top', 'table', 'writing', 'spot', 'orange', 'hole', 'fruit', 'stem', 'bunch', 'dot', 'banana'] 2022-03-16 20:18:58,862.862 2829:trainer.py:487 do_train_dict(): eta: 15:44:11 iter: 33000 speed: 289.4 images/sec total_norm: 141.5494 (144.7813) loss: 144.2122 (144.1205) masked_loss: 1.5725 (1.5776) tag_loss: 142.5147 (142.5429) time: 1.4320 (1.7691) data: 0.0001 (0.0001) to_device: 0.0052 (0.0051) time_gpu: 1.4268 (1.7638) save_time: 8.8421 (19.6009) lr: 0.000050 max mem: 26307 2022-03-16 20:18:59,223.223 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6111111044883728 2022-03-16 20:18:59,223.223 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 153.92584228515625 2022-03-16 20:18:59,223.223 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.69780951324186 2022-03-16 20:19:16,162.162 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020115673542022705 2022-03-16 20:19:16,162.162 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 20:19:16,162.162 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'dog', 'laying', 'on', 'a', 'bed', 'next', 'to', 'a', 'cat', 'cellar', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 20:19:16,177.177 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['ear', 'head', 'leg', 'nose', 'eye', 'dog', 'floor', 'blanket', 'tail', 'table', 'book', 'tag', 'face', 'cat', 'collar', 'towel', 'bed', 'paw', 'cord', 'wall', 'mouth', '[UNK]', 'shelf', 'box', 'pillow', 'basket', 'stripe', 'suitcase', 'chair', 'top', 'wire', 'couch', 'carpet', 'furniture', 'cabinet', 'magazine', 'outlet', 'dvd', 'door', 'next', 'hair', 'paper', 'large', 'brown', 'bag', 'rug', 'black', 'white', 'bowl', 'small'] 2022-03-16 20:19:32,106.106 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'face', 'large', 'book', 'floor', 'bed', 'table', 'wall', 'magazine', 'eye', 'box', 'dog', 'leg', 'nose', 'ear', 'cat', 'tail', 'tag', 'leaf', 'blanket', 'curtain', 'cord', 'outlet'] 2022-03-16 20:21:55,771.771 2829:trainer.py:487 do_train_dict(): eta: 15:41:29 iter: 33100 speed: 289.4 images/sec total_norm: 141.7772 (145.0206) loss: 144.6963 (144.3815) masked_loss: 1.5534 (1.5701) tag_loss: 143.4065 (142.8114) time: 1.4317 (1.7691) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4265 (1.7639) save_time: 8.8421 (19.6009) lr: 0.000050 max mem: 26307 2022-03-16 20:21:56,131.131 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4571428596973419 2022-03-16 20:21:56,131.131 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 163.4749298095703 2022-03-16 20:21:56,132.132 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.68700901858777 2022-03-16 20:22:13,001.001 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020109426230192184 2022-03-16 20:22:13,001.001 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 20:22:13,002.002 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'person', 'riding', '[MASK]', '##s', '[MASK]', 'a', 'snowy', 'slope', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 20:22:13,017.017 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['glove', 'snow', '[UNK]', 'pole', 'ski', 'ground', 'person', 'jacket', 'skier', 'helmet', 'head', 'hand', 'boot', 'man', 'shirt', 'face', 'slope', 'hill', 'shadow', 'mountain', 'sky', 'cloud', 'hat', 'arm', 'boy', 'tree', 'woman', 'coat', 'snowy', 'foot', 'scarf', 'poles', 'fence', 'leg', 'writing', 'track', 'field', 'hair', 'downhill', 'letter', 'vest', 'sign', 'flag', 'shoe', 'logo', 'line', 'hood', 'skiing', 'trail', 'rock'] 2022-03-16 20:22:29,011.011 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'ground', 'hair', 'person', 'structure', 'sign', 'sky', 'shirt', 'snow', 'cloud', 'pole', 'jacket', 'ski', 'fence', 'boot', 'slope', 'helmet', 'glove', 'snowy', 'skier'] 2022-03-16 20:24:52,826.826 2829:trainer.py:487 do_train_dict(): eta: 15:38:46 iter: 33200 speed: 289.2 images/sec total_norm: 143.1122 (145.4719) loss: 144.1298 (145.8069) masked_loss: 1.5737 (1.6303) tag_loss: 142.2640 (144.1767) time: 1.4316 (1.7706) data: 0.0001 (0.0005) to_device: 0.0051 (0.0050) time_gpu: 1.4264 (1.7651) save_time: 8.8421 (19.6009) lr: 0.000050 max mem: 26307 2022-03-16 20:24:53,188.188 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6470588445663452 2022-03-16 20:24:53,188.188 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 110.79651641845703 2022-03-16 20:24:53,189.189 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.70362829947257 2022-03-16 20:25:10,169.169 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020150184631347656 2022-03-16 20:25:10,169.169 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 20:25:10,169.169 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'putting', 'a', '[MASK]', 'cup', 'in', 'a', 'microwave', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 20:25:10,185.185 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hair', 'hand', 'nose', 'eye', 'wall', 'shirt', 'head', 'cup', 'face', 'ear', 'man', '[UNK]', 'strawberry', 'ceiling', 'person', 'sweater', 'cord', 'microwave', 'finger', 'door', 'light', 'handle', 'mug', 'window', 'design', 'clock', 'thumb', 'shelf', 'bowl', 'kitchen', 'mustache', 'heart', 'tile', 'cabinet', 'flower', 'picture', 'glass', 'plate', 'leaf', 'food', 'woman', 'pot', 'outlet', 'front', 'oven', 'container', 'refrigerator', 'beard', 'knob', 'eyebrow'] 2022-03-16 20:25:26,134.134 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'hand', 'face', 'door', 'light', 'cup', 'heart', 'hair', 'design', 'person', 'wall', 'eye', 'window', 'shirt', 'coffee', 'finger', 'nose', 'ear', 'cabinet', 'ceiling', 'thumb', 'reflection', 'sweater', 'fixture', 'strawberry', 'microwave', 'mustache'] 2022-03-16 20:27:49,550.550 2829:trainer.py:487 do_train_dict(): eta: 15:36:04 iter: 33300 speed: 289.7 images/sec total_norm: 140.5572 (143.3378) loss: 144.4606 (145.6001) masked_loss: 1.6290 (1.6212) tag_loss: 143.2790 (143.9788) time: 1.4318 (1.7672) data: 0.0001 (0.0002) to_device: 0.0052 (0.0050) time_gpu: 1.4266 (1.7620) save_time: 8.8421 (19.6009) lr: 0.000050 max mem: 26307 2022-03-16 20:27:49,911.911 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6571428775787354 2022-03-16 20:27:49,911.911 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 132.1900634765625 2022-03-16 20:27:49,911.911 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.70051733462397 2022-03-16 20:28:07,065.065 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02015579864382744 2022-03-16 20:28:07,066.066 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 20:28:07,066.066 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'man', 'standing', 'outside', '[MASK]', 'business', 'using', 'cell', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 20:28:07,082.082 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sign', 'jacket', 'building', 'hand', 'hair', 'door', 'head', 'wall', 'window', 'man', 'bike', 'face', 'letter', 'arm', 'bicycle', '[UNK]', 'coat', 'phone', 'ear', 'store', 'mouth', 'handle', 'glasses', 'bottle', 'graffiti', 'collar', 'cell', 'writing', 'shirt', 'pole', 'glass', 'front', 'sidewalk', 'number', 'next', 'jean', 'stop', 'bag', 'old', 'motorcycle', 'chain', 'basket', 'pipe', 'street', 'camera', 'nose', 'reflection', 'black', 'vent', 'shop'] 2022-03-16 20:28:23,043.043 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'building', 'door', 'business', 'hair', 'outside', 'wall', 'arm', 'phone', 'window', 'cell', 'store', 'letter', 'sign', 'ear', 'chain', 'handle', 'wheel', 'pole', 'jacket', 'bike', 'bicycle', 'tire', 'poster', 'graffiti'] 2022-03-16 20:30:46,608.608 2829:trainer.py:487 do_train_dict(): eta: 15:33:22 iter: 33400 speed: 289.2 images/sec total_norm: 141.0561 (145.0242) loss: 144.2563 (145.0876) masked_loss: 1.5312 (1.5587) tag_loss: 142.6852 (143.5288) time: 1.4335 (1.7705) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4284 (1.7653) save_time: 8.8421 (19.6009) lr: 0.000050 max mem: 26307 2022-03-16 20:30:46,970.970 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5 2022-03-16 20:30:46,970.970 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 126.08206939697266 2022-03-16 20:30:46,970.970 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.71401871638511 2022-03-16 20:31:04,083.083 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02022990956902504 2022-03-16 20:31:04,084.084 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 20:31:04,084.084 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'plan', 'in', '[MASK]', '[MASK]', 'with', 'cord', 'attached', 'and', 'stairs', 'attached', 'to', 'open', 'door', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 20:31:04,100.100 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['airplane', 'window', 'wing', 'engine', 'cockpit', 'ground', 'stair', 'sky', 'floor', '[UNK]', 'nose', 'wheel', 'door', 'tail', 'logo', 'building', 'airport', 'staircase', 'front', 'light', 'cone', 'line', 'plane', 'wall', 'windshield', 'person', 'cart', 'jet', 'vehicle', 'ladder', 'platform', 'step', 'tire', 'car', 'large', 'ceiling', 'terminal', 'landing', 'man', 'gear', 'truck', 'letter', 'number', 'shirt', 'walkway', 'stripe', 'tunnel', 'railing', 'ramp', 'shadow'] 2022-03-16 20:31:20,036.036 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'line', 'body', 'door', 'front', 'ground', 'person', 'engine', 'airport', 'window', 'wing', 'sky', 'roof', 'nose', 'wheel', 'tail', 'ceiling', 'logo', 'ladder', 'cord', 'airplane', 'cockpit', 'propeller', 'windshield', 'stair'] 2022-03-16 20:33:43,841.841 2829:trainer.py:487 do_train_dict(): eta: 15:30:40 iter: 33500 speed: 288.9 images/sec total_norm: 141.8338 (145.9614) loss: 143.0138 (144.8058) masked_loss: 1.4632 (1.5366) tag_loss: 141.7024 (143.2691) time: 1.4325 (1.7724) data: 0.0001 (0.0001) to_device: 0.0052 (0.0051) time_gpu: 1.4273 (1.7671) save_time: 8.8421 (19.6009) lr: 0.000050 max mem: 26307 2022-03-16 20:33:44,205.205 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6060606241226196 2022-03-16 20:33:44,205.205 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 155.57501220703125 2022-03-16 20:33:44,206.206 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.7188180628277 2022-03-16 20:34:01,404.404 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020215952768921852 2022-03-16 20:34:01,404.404 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 20:34:01,404.404 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'vintage', '[MASK]', 'style', 'clock', 'is', 'on', 'an', 'outdoor', 'pot', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 20:34:01,420.420 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'pole', 'building', 'cloud', 'grass', 'light', 'road', 'street', 'sign', 'bush', 'tree', 'window', 'car', 'truck', 'city', 'tower', 'graffiti', 'ground', '[UNK]', 'line', 'traffic', 'roof', 'bench', 'bridge', 'cloudy', 'logo', 'door', 'fence', 'sand', 'trailer', 'stop', 'windshield', 'telephone', 'person', 'bus', 'post', 'water', 'hill', 'side', 'distance', 'clock', 'beach', 'large', 'empty', 'red', 'sidewalk', 'antenna', 'cross', 'next', 'wall'] 2022-03-16 20:34:17,348.348 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'can', 'hand', 'building', 'road', 'street', 'light', 'ground', 'wall', 'bridge', 'window', 'tree', 'tower', 'sky', 'roof', 'snow', 'clock', 'grass', 'bush', 'cloud', 'pole', 'bench', 'outdoor', 'barrel', 'fence', 'pot', 'ladder', 'crane', 'trash', 'graffiti'] 2022-03-16 20:36:40,825.825 2829:trainer.py:487 do_train_dict(): eta: 15:27:57 iter: 33600 speed: 289.3 images/sec total_norm: 141.5611 (145.0671) loss: 141.5509 (143.5653) masked_loss: 1.5555 (1.5376) tag_loss: 139.6334 (142.0277) time: 1.4310 (1.7699) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4258 (1.7647) save_time: 8.8421 (19.6009) lr: 0.000049 max mem: 26307 2022-03-16 20:36:41,187.187 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5151515007019043 2022-03-16 20:36:41,187.187 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 136.50167846679688 2022-03-16 20:36:41,187.187 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.72325416134798 2022-03-16 20:36:58,328.328 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020254503935575485 2022-03-16 20:36:58,328.328 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 20:36:58,328.328 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'sandwich', 'with', 'chocolate', 'spread', 'arranged', 'on', '[MASK]', 'white', 'plate', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 20:36:58,344.344 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['plate', 'table', 'cake', 'sandwich', 'cheese', 'chocolate', 'sauce', 'meat', 'food', 'bread', 'dessert', '[UNK]', 'handle', 'cream', 'ice', 'white', 'napkin', 'layer', 'crust', 'top', 'container', 'bowl', 'fork', 'piece', 'wall', 'design', 'background', 'steak', 'bean', 'butter', 'hand', 'egg', 'close', 'light', 'stripe', 'half', 'spoon', 'label', 'glass', 'bun', 'finger', 'shadow', 'bottle', 'stain', 'pie', 'hole', 'stem', 'object', 'eaten', 'paper'] 2022-03-16 20:37:14,243.243 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['white', 'top', 'table', 'food', 'spread', 'letter', 'label', 'background', 'plate', 'shadow', 'apple', 'bread', 'stem', 'chocolate', 'logo', 'cheese', 'cake', 'sandwich', 'lid', 'sauce', 'banana', 'jar', 'dessert'] 2022-03-16 20:39:38,087.087 2829:trainer.py:487 do_train_dict(): eta: 15:25:15 iter: 33700 speed: 288.8 images/sec total_norm: 140.8120 (144.1257) loss: 143.2858 (145.2971) masked_loss: 1.5024 (1.5549) tag_loss: 142.1136 (143.7421) time: 1.4321 (1.7726) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4266 (1.7673) save_time: 8.8421 (19.6009) lr: 0.000049 max mem: 26307 2022-03-16 20:39:38,449.449 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4848484992980957 2022-03-16 20:39:38,449.449 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 146.9434051513672 2022-03-16 20:39:38,450.450 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.7336538529255 2022-03-16 20:39:55,734.734 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020248694345355034 2022-03-16 20:39:55,734.734 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 20:39:55,735.735 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'small', 'child', 'is', 'sitting', 'on', 'a', 'toilet', 'with', '[MASK]', '[MASK]', 'device', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 20:39:55,750.750 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hair', 'toilet', 'shirt', 'hand', 'leg', 'boy', 'floor', 'wall', 'sock', 'bowl', 'shoe', 'head', 'child', 'bathroom', 'seat', 'window', '[UNK]', 'phone', 'book', 'arm', 'lid', 'short', 'face', 'tile', 'tank', 'door', 'person', 'young', 'handle', 'elbow', 'girl', 'foot', 'boot', 'reflection', 'ear', 'cell', 'curtain', 'brush', 'nose', 'black', 'remote', 'water', 'paper', 'room', 'picture', 'light', 'pipe', 'ceiling', 'photo', 'ledge'] 2022-03-16 20:40:11,704.704 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'hand', 'face', 'small', 'book', 'hair', 'girl', 'floor', 'child', 'seat', 'boy', 'base', 'window', 'box', 'shirt', 'leg', 'clothes', 'bowl', 'electronic', 'device', 'bathroom', 'toilet', 'tile', 'sock'] 2022-03-16 20:42:35,524.524 2829:trainer.py:487 do_train_dict(): eta: 15:22:33 iter: 33800 speed: 288.6 images/sec total_norm: 140.3011 (144.1053) loss: 145.0025 (144.8953) masked_loss: 1.5766 (1.5926) tag_loss: 143.4026 (143.3026) time: 1.4325 (1.7744) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4272 (1.7692) save_time: 8.8421 (19.6009) lr: 0.000049 max mem: 26307 2022-03-16 20:42:35,886.886 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5625 2022-03-16 20:42:35,886.886 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 128.20892333984375 2022-03-16 20:42:35,886.886 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.74572365684847 2022-03-16 20:42:53,181.181 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02030801586806774 2022-03-16 20:42:53,181.181 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 20:42:53,182.182 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'white', 'vase', 'with', 'several', '[MASK]', 'flowers', 'in', 'it', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 20:42:53,197.197 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hand', 'flower', 'vase', 'finger', 'shirt', 'man', 'ring', 'table', 'person', 'rose', '[UNK]', 'arm', 'napkin', 'leaf', 'wall', 'mouth', 'handle', 'neck', 'phone', 'pitcher', 'cup', 'button', 'container', 'plate', 'paper', 'cell', 'hair', 'cloth', 'base', 'face', 'cake', 'white', 'watch', 'woman', 'elbow', 'candy', 'top', 'jug', 'thumb', 'design', 'glass', 'chair', 'wrist', 'rim', 'remote', 'light', 'ear', 'pink', 'lid', 'stripe'] 03-16 20:42:57.047 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 20:42:57.047 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 20:42:57.730 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 13}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 13}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 13}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 13}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 13}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 12}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 12}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 12}] 2022-03-16 20:43:09,179.179 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'hand', 'several', 'white', 'person', 'table', 'ring', 'finger', 'handle', 'salt', 'flower', 'stem', 'elbow', 'candy', 'pepper', 'colorful', 'vase', 'jug'] 2022-03-16 20:45:32,641.641 2829:trainer.py:487 do_train_dict(): eta: 15:19:50 iter: 33900 speed: 289.1 images/sec total_norm: 141.5618 (144.2980) loss: 143.2617 (142.4413) masked_loss: 1.5043 (1.5261) tag_loss: 141.7666 (140.9151) time: 1.4326 (1.7712) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4274 (1.7661) save_time: 8.8421 (19.6009) lr: 0.000049 max mem: 26307 2022-03-16 20:45:33,002.002 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6470588445663452 2022-03-16 20:45:33,002.002 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 145.77383422851562 2022-03-16 20:45:33,002.002 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.7429765028112 2022-03-16 20:45:50,244.244 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02031378448009491 2022-03-16 20:45:50,245.245 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 20:45:50,245.245 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'is', 'lighting', '[MASK]', 'piece', 'if', 'cake', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 20:45:50,260.260 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hand', 'cake', 'plate', 'table', 'finger', 'candle', 'person', 'fork', '[UNK]', 'flame', 'shadow', 'handle', 'knife', 'ring', 'white', 'wall', 'man', 'background', 'lit', 'blade', 'wrist', 'arm', 'food', 'design', 'thumb', 'light', 'shirt', 'napkin', 'photo', 'piece', 'picture', 'glass', 'woman', 'spoon', 'head', 'cloth', 'nail', 'top', 'birthday', 'logo', 'face', 'stem', 'eye', 'sleeve', 'dark', 'small', 'watch', 'front', 'stick', 'couple'] 2022-03-16 20:46:06,104.104 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'hand', 'person', 'table', 'arm', 'piece', 'finger', 'handle', 'plate', 'shadow', 'knife', 'flame', 'fork', 'cake', 'candle'] 2022-03-16 20:48:30,199.199 2829:trainer.py:487 do_train_dict(): eta: 15:17:08 iter: 34000 speed: 288.4 images/sec total_norm: 139.9951 (143.6482) loss: 145.9922 (145.1682) masked_loss: 1.5660 (1.5908) tag_loss: 144.3510 (143.5774) time: 1.4319 (1.7756) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4267 (1.7703) save_time: 8.8421 (19.6009) lr: 0.000049 max mem: 26307 2022-03-16 20:48:30,560.560 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6470588445663452 2022-03-16 20:48:30,560.560 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 139.73997497558594 2022-03-16 20:48:30,560.560 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.74217898684863 2022-03-16 20:48:48,098.098 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020349852740764618 2022-03-16 20:48:48,099.099 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 20:48:48,099.099 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'black', 'bunk', 'bed', '[MASK]', 'a', 'room', 'with', 'the', 'name', 'palmer', 'on', '##pled', 'wall', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 20:48:48,115.115 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'bed', 'floor', 'window', 'ladder', 'carpet', 'bunk', 'room', 'ceiling', 'outlet', '[UNK]', 'sheet', 'pillow', 'frame', 'toy', 'shelf', 'light', 'blanket', 'bedroom', 'post', 'curtain', 'sign', 'mattress', 'decoration', 'animal', 'door', 'lamp', 'star', 'stripe', 'fan', 'leg', 'drawer', 'handle', 'small', 'switch', 'flower', 'board', 'picture', 'bar', 'rack', 'mat', 'chair', 'blind', 'mirror', 'rug', 'knob', 'head', 'blue', 'rail', 'object'] 2022-03-16 20:49:04,067.067 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'name', 'room', 'black', 'light', 'board', 'floor', 'bed', 'wall', 'stand', 'window', 'ball', 'bedroom', 'fan', 'ceiling', 'shade', 'toy', 'carpet', 'ladder', 'curtain', 'shelf', 'outlet', 'stripe', 'bunk'] 2022-03-16 20:51:27,588.588 2829:trainer.py:487 do_train_dict(): eta: 15:14:25 iter: 34100 speed: 288.6 images/sec total_norm: 141.9835 (142.5334) loss: 143.0666 (144.1223) masked_loss: 1.4782 (1.5240) tag_loss: 141.9223 (142.5983) time: 1.4321 (1.7738) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4269 (1.7686) save_time: 8.8421 (19.6009) lr: 0.000049 max mem: 26307 2022-03-16 20:51:27,947.947 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6363636255264282 2022-03-16 20:51:27,948.948 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 162.42034912109375 2022-03-16 20:51:27,948.948 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.73802888881393 2022-03-16 20:51:45,256.256 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02031753584742546 2022-03-16 20:51:45,256.256 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 20:51:45,256.256 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'beautiful', 'woman', 'holding', 'a', 'tennis', 'ra', '##c', '##quet', 'on', 'a', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 20:51:45,272.272 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hand', 'hair', '[UNK]', 'tennis', 'shirt', 'belt', 'ball', 'woman', 'ponytail', 'head', 'court', 'arm', 'face', 'leg', 'player', 'nose', 'handle', 'ground', 'mouth', 'eye', 'band', 'string', 'ear', 'net', 'short', 'line', 'wrist', 'waist', 'watch', 'wall', 'logo', 'pony', 'tail', 'fence', 'outfit', 'finger', 'top', 'bracelet', 'buckle', 'stripe', 'uniform', 'person', 'tape', 'female', 'ready', 'sleeve', 'shoe', 'sock', 'game', 'white'] 2022-03-16 20:52:01,107.107 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'hand', 'face', 'band', 'player', 'woman', 'court', 'ground', 'hair', 'mouth', 'arm', 'eye', 'beautiful', 'ball', 'ring', 'shirt', 'finger', 'nose', 'ear', 'pocket', 'handle', 'tennis', 'string', 'belt', 'net', 'waist', 'wrist', 'ponytail'] 2022-03-16 20:54:25,035.035 2829:trainer.py:487 do_train_dict(): eta: 15:11:43 iter: 34200 speed: 288.5 images/sec total_norm: 141.1525 (145.2789) loss: 148.2432 (148.3236) masked_loss: 1.6071 (1.6107) tag_loss: 146.1701 (146.7129) time: 1.4327 (1.7744) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4277 (1.7693) save_time: 8.8421 (19.6009) lr: 0.000049 max mem: 26307 2022-03-16 20:54:25,399.399 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6764705777168274 2022-03-16 20:54:25,399.399 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 134.07925415039062 2022-03-16 20:54:25,400.400 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.74150257833497 2022-03-16 20:54:42,913.913 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020320162177085876 2022-03-16 20:54:42,913.913 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 20:54:42,913.913 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', '[MASK]', 'and', 'canvas', 'chairs', ',', 'one', 'tilted', 'forward', 'with', 'a', 'cat', 'laying', 'under', '[MASK]', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 20:54:42,928.928 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['cat', 'chair', 'wall', 'ground', 'ear', 'floor', 'head', 'door', 'leg', 'table', 'flower', 'cushion', '[UNK]', 'tail', 'cloth', 'pillow', 'paw', 'kitten', 'blanket', 'leaf', 'nose', 'eye', 'seat', 'shadow', 'window', 'small', 'back', 'top', 'mat', 'next', 'dot', 'building', 'cord', 'curtain', 'orange', 'white', 'circle', 'wooden', 'wheel', 'room', 'line', 'front', 'wood', 'reflection', 'brown', 'sidewalk', 'face', 'patio', 'paper', 'magazine'] 2022-03-16 20:54:58,833.833 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'line', 'door', 'ground', 'floor', 'table', 'wall', 'chair', 'wood', 'leg', 'cat', 'net', 'flower', 'canvas'] 2022-03-16 20:57:22,671.671 2829:trainer.py:487 do_train_dict(): eta: 15:09:00 iter: 34300 speed: 288.2 images/sec total_norm: 144.4255 (146.0896) loss: 146.3049 (145.7596) masked_loss: 1.5811 (1.5534) tag_loss: 145.1051 (144.2063) time: 1.4333 (1.7764) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4281 (1.7713) save_time: 8.8421 (19.6009) lr: 0.000048 max mem: 26307 2022-03-16 20:57:23,031.031 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6875 2022-03-16 20:57:23,031.031 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 172.26119995117188 2022-03-16 20:57:23,032.032 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.73583235851554 2022-03-16 20:57:40,437.437 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020339546725153923 2022-03-16 20:57:40,437.437 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 20:57:40,438.438 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'and', 'several', '[MASK]', 'standing', 'in', 'a', 'pet', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 20:57:40,453.453 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['cage', 'shirt', 'man', 'glasses', 'bird', 'hair', 'shelf', '[UNK]', 'bag', 'woman', 'hand', 'container', 'person', 'box', 'crate', 'cart', 'head', 'table', 'arm', 'shoe', 'watch', 'face', 'basket', 'banana', 'bottle', 'building', 'bucket', 'bin', 'case', 'jean', 'door', 'strap', 'tray', 'sign', 'jug', 'store', 'cooler', 'lady', 'cap', 'shop', 'floor', 'rack', 'light', 'food', 'ground', 'jacket', 'hat', 'girl', 'ceiling', 'glove'] 2022-03-16 20:57:56,366.366 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'several', 'woman', 'hair', 'girl', 'person', 'child', 'table', 'lady', 'watch', 'box', 'jean', 'shirt', 'shop', 'bag', 'bird', 'belt', 'glasses', 'cage', 'purse', 'pet', 'boot', 'skirt', 'ladder', 'cart', 'shelf', 'container', 'tray', 'banana', 'scissors', 'crate'] 2022-03-16 21:00:20,309.309 2829:trainer.py:487 do_train_dict(): eta: 15:06:18 iter: 34400 speed: 288.2 images/sec total_norm: 143.8343 (148.4238) loss: 147.6006 (147.1369) masked_loss: 1.4889 (1.5125) tag_loss: 145.9933 (145.6244) time: 1.4318 (1.7764) data: 0.0001 (0.0005) to_device: 0.0051 (0.0050) time_gpu: 1.4264 (1.7709) save_time: 8.8421 (19.6009) lr: 0.000048 max mem: 26307 2022-03-16 21:00:20,670.670 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5 2022-03-16 21:00:20,671.671 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 131.43788146972656 2022-03-16 21:00:20,671.671 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.74925847813704 2022-03-16 21:00:38,221.221 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020341960713267326 2022-03-16 21:00:38,221.221 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 21:00:38,221.221 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', '[MASK]', 'ready', 'to', 'launch', 'a', 'colorful', 'kite', 'on', 'the', 'beach', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 21:00:38,236.236 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'shirt', 'hat', 'cloud', 'man', 'kite', 'building', '[UNK]', 'hand', 'head', 'arm', 'stair', 'grass', 'tree', 'railing', 'ground', 'flag', 'sunglasses', 'person', 'fence', 'short', 'boy', 'leg', 'shoe', 'wall', 'roof', 'jean', 'bag', 'bush', 'child', 'window', 'step', 'pole', 'shadow', 'sidewalk', 'bridge', 'cap', 'woman', 'ladder', 'umbrella', 'flower', 'tail', 'hair', 'post', 'sign', 'house', 'truck', 'glasses', 'car', 'park'] 2022-03-16 21:00:54,082.082 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'house', 'hand', 'face', 'building', 'short', 'ground', 'post', 'arm', 'hill', 'date', 'ready', 'foot', 'beach', 'sky', 'shirt', 'leg', 'roof', 'grass', 'hat', 'cloud', 'colorful', 'kite'] 2022-03-16 21:03:17,806.806 2829:trainer.py:487 do_train_dict(): eta: 15:03:35 iter: 34500 speed: 288.5 images/sec total_norm: 143.5294 (145.6437) loss: 143.9007 (144.6687) masked_loss: 1.5323 (1.5347) tag_loss: 142.4247 (143.1339) time: 1.4313 (1.7750) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4261 (1.7698) save_time: 8.8421 (19.6009) lr: 0.000048 max mem: 26307 2022-03-16 21:03:18,166.166 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.53125 2022-03-16 21:03:18,166.166 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 140.76185607910156 2022-03-16 21:03:18,166.166 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.75202440250816 2022-03-16 21:03:35,964.964 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02033899910748005 2022-03-16 21:03:35,964.964 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 21:03:35,965.965 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'animal', '[MASK]', 'sitting', 'in', 'between', 'two', 'pillows', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 21:03:35,980.980 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['bed', 'pillow', 'wall', '[UNK]', 'head', 'eye', 'bear', 'ear', 'teddy', 'knob', 'arm', 'blanket', 'shadow', 'drawer', 'table', 'post', 'animal', 'window', 'nightstand', 'nose', 'panel', 'top', 'shade', 'frame', 'stuffed', 'leg', 'sheet', 'lamp', 'dresser', 'face', 'bolt', 'light', 'cover', 'wood', 'bedroom', 'board', 'picture', 'wooden', 'design', 'small', 'foot', 'room', 'white', 'paw', 'toy', 'flower', 'laying', 'next', 'chair', 'screw'] 2022-03-16 21:03:51,937.937 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'bed', 'wall', 'arm', 'eye', 'animal', 'ear', 'bear', 'shadow', 'blanket', 'pillow', 'lamp', 'teddy', 'stuffed', 'drawer', 'strap', 'knob'] 2022-03-16 21:06:15,636.636 2829:trainer.py:487 do_train_dict(): eta: 15:00:52 iter: 34600 speed: 287.9 images/sec total_norm: 141.7339 (143.3657) loss: 143.4248 (144.9419) masked_loss: 1.4986 (1.5056) tag_loss: 141.2679 (143.4364) time: 1.4327 (1.7783) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4276 (1.7731) save_time: 8.8421 (19.6009) lr: 0.000048 max mem: 26307 2022-03-16 21:06:15,998.998 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6176470518112183 2022-03-16 21:06:15,999.999 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 141.89132690429688 2022-03-16 21:06:15,999.999 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.76205342097653 2022-03-16 21:06:33,543.543 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020345093682408333 2022-03-16 21:06:33,543.543 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 21:06:33,544.544 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'woman', ',', 'one', 'holding', 'a', 'chicken', 'and', 'one', '[MASK]', 'a', 'don', '##ut', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 21:06:33,559.559 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hand', 'shirt', 'hair', 'face', 'box', 'wall', 'table', 'head', 'eye', 'woman', 'girl', 'nose', '[UNK]', 'window', 'bag', 'mouth', 'couch', 'dress', 'paper', 'glasses', 'chair', 'napkin', 'arm', 'ear', 'child', 'picture', 'man', 'food', 'watch', 'bracelet', 'cup', 'pillow', 'door', 'straw', 'floor', 'finger', 'hat', 'knife', 'plant', 'necklace', 'wrist', 'bow', 'book', 'tissue', 'cabinet', 'bird', 'neck', 'glass', 'lady', 'handle'] 2022-03-16 21:06:49,440.440 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'hand', 'face', 'woman', 'hair', 'mouth', 'table', 'wall', 'arm', 'eye', 'chair', 'paper', 'box', 'shirt', 'dress', 'nose', 'chicken', 'curtain', 'napkin'] 2022-03-16 21:09:13,324.324 2829:trainer.py:487 do_train_dict(): eta: 14:58:09 iter: 34700 speed: 288.1 images/sec total_norm: 143.1438 (145.1093) loss: 143.4171 (145.1008) masked_loss: 1.6046 (1.6009) tag_loss: 141.8913 (143.4999) time: 1.4326 (1.7768) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4274 (1.7716) save_time: 8.8421 (19.6009) lr: 0.000048 max mem: 26307 2022-03-16 21:09:13,684.684 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.8333333134651184 2022-03-16 21:09:13,684.684 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 144.01309204101562 2022-03-16 21:09:13,685.685 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.770817603188 2022-03-16 21:09:31,507.507 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020339904353022575 2022-03-16 21:09:31,507.507 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 21:09:31,507.507 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'people', 'in', '[MASK]', 'coats', '[MASK]', 'doing', 'something', 'they', 'pulled', 'up', '[MASK]', 'their', 'laptop', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 21:09:31,523.523 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['laptop', 'keyboard', 'hand', 'hair', 'table', 'screen', 'woman', 'shirt', 'key', 'computer', 'person', '[UNK]', 'desk', 'wall', 'glasses', 'pole', 'jacket', 'chair', 'tray', 'head', 'sleeve', 'man', 'pen', 'ear', 'face', 'picture', 'handle', 'bottle', 'girl', 'sunglasses', 'container', 'cord', 'mouse', 'bag', 'food', 'cup', 'purse', 'light', 'scissors', 'boy', 'fork', 'box', 'plate', 'ring', 'glass', 'arm', 'logo', 'lamp', 'spoon', 'paper'] 2022-03-16 21:09:47,513.513 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'something', 'woman', 'hair', 'person', 'table', 'wall', 'key', 'computer', 'shirt', 'screen', 'ear', 'desk', 'handle', 'coat', 'pan', 'jacket', 'lab', 'pen', 'glasses', 'logo', 'brush', 'keyboard', 'tray', 'laptop', 'sunglasses'] 2022-03-16 21:12:11,473.473 2829:trainer.py:487 do_train_dict(): eta: 14:55:27 iter: 34800 speed: 287.4 images/sec total_norm: 142.2603 (144.7059) loss: 142.1245 (144.3255) masked_loss: 1.5166 (1.5313) tag_loss: 140.5276 (142.7943) time: 1.4339 (1.7815) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4290 (1.7763) save_time: 8.8421 (19.6009) lr: 0.000048 max mem: 26307 2022-03-16 21:12:11,833.833 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6571428775787354 2022-03-16 21:12:11,834.834 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 140.48550415039062 2022-03-16 21:12:11,834.834 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.76653524797761 2022-03-16 21:12:29,800.800 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020334212109446526 2022-03-16 21:12:29,801.801 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 21:12:29,801.801 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'feelings', 'old', 'computer', '[MASK]', 'has', 'been', 'decorated', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 21:12:29,816.816 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['screen', 'table', 'button', 'wall', 'television', 'key', 'keyboard', 'panel', 'desk', 'reflection', 'light', '[UNK]', 'phone', 'knob', 'remote', 'drawer', 'old', 'wooden', 'cabinet', 'control', 'design', 'door', 'wire', 'speaker', 'top', 'box', 'cord', 'computer', 'next', 'dial', 'logo', 'shadow', 'monitor', 'book', 'small', 'mouse', 'colorful', 'set', 'tray', 'floor', 'room', 'paper', 'number', 'curtain', 'cell', 'handle', 'picture', 'tv', 'close', 'red'] 2022-03-16 21:12:45,731.731 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['old', 'television', 'table', 'wall', 'phone', 'key', 'computer', 'screen', 'desk', 'speaker', 'button', 'keyboard', 'reflection', 'drawer', 'dial', 'knob'] 03-16 21:12:57.831 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 21:12:57.831 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 21:12:59.144 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 97}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-16 21:15:09,378.378 2829:trainer.py:487 do_train_dict(): eta: 14:52:44 iter: 34900 speed: 287.8 images/sec total_norm: 142.3826 (146.3888) loss: 144.5862 (144.5441) masked_loss: 1.4906 (1.5241) tag_loss: 142.8866 (143.0199) time: 1.4335 (1.7791) data: 0.0001 (0.0002) to_device: 0.0050 (0.0050) time_gpu: 1.4283 (1.7740) save_time: 8.8421 (19.6009) lr: 0.000047 max mem: 26307 2022-03-16 21:15:09,738.738 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5135135054588318 2022-03-16 21:15:09,739.739 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 123.50486755371094 2022-03-16 21:15:09,739.739 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.7805048588344
2022-03-16 21:15:27,494.494 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020334094762802124
2022-03-16 21:15:27,494.494 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 21:15:27,495.495 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'several', 'zebra', '[MASK]', 'are', 'standing', '[MASK]', 'to', 'the', 'camera', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 21:15:27,510.510 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['zebra', 'ear', 'eye', 'mane', 'head', 'nose', 'ground', 'leg', 'stripe', '[UNK]', 'mouth', 'face', 'neck', 'dirt', 'other', 'grass', 'rock', 'foot', 'close', 'next', 'spot', 'hair', 'tree', 'back', 'field', 'chin', 'plant', 'area', 'white', 'fence', 'bush', 'group', 'muzzle', 'body', 'hay', 'background', 'side', 'branch', 'wall', 'shadow', 'leaf', 'herd', 'snout', 'view', 'couple', 'road', 'picture', 'paw', 'camera', 'trunk']
2022-03-16 21:15:43,439.439 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'face', 'ground', 'mouth', 'eye', 'neck', 'foot', 'leg', 'nose', 'ear', 'camera', 'stripe', 'mane', 'zebra']
2022-03-16 21:18:07,277.277 2829:trainer.py:487 do_train_dict(): eta: 14:50:01 iter: 35000 speed: 287.8 images/sec total_norm: 143.2993 (145.0793) loss: 140.3727 (141.5709) masked_loss: 1.4522 (1.4828) tag_loss: 138.5324 (140.0880) time: 1.4319 (1.7790) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4267 (1.7738) save_time: 8.8421 (19.6009) lr: 0.000047 max mem: 26307
2022-03-16 21:18:07,279.279 2829:checkpoint.py:222 save(): Saving checkpoint to output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0035000.pt
2022-03-16 21:18:16,824.824 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5625
2022-03-16 21:18:16,824.824 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 150.883544921875
2022-03-16 21:18:16,824.824 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.78594291447914
2022-03-16 21:18:34,897.897 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02035629004240036
2022-03-16 21:18:34,897.897 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 21:18:34,898.898 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'that', 'is', 'on', 'her', '[MASK]', 'phone', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 21:18:34,913.913 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hand', 'tire', 'woman', 'face', 'leg', 'bracelet', 'hair', 'car', 'head', 'ring', 'dress', 'sunglasses', 'phone', 'glasses', 'bench', 'wheel', '[UNK]', 'shadow', 'wall', 'plant', 'finger', 'shoe', 'window', 'cell', 'street', 'nose', 'girl', 'building', 'arm', 'shirt', 'sidewalk', 'short', 'light', 'person', 'mouth', 'brick', 'ground', 'wrist', 'suv', 'foot', 'tree', 'road', 'pot', 'rim', 'bush', 'heel', 'man', 'handle', 'lady', 'weed']
2022-03-16 21:18:50,635.635 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'hand', 'face', 'street', 'woman', 'car', 'ground', 'hair', 'girl', 'person', 'wall', 'phone', 'lady', 'plant', 'window', 'tree', 'cell', 'branch', 'ring', 'block', 'leg', 'dress', 'bag', 'shadow', 'wheel', 'pole', 'flower', 'bench', 'leaf', 'glasses', 'tire', 'sunglasses', 'bracelet']
2022-03-16 21:21:13,592.592 2829:trainer.py:487 do_train_dict(): eta: 14:47:26 iter: 35100 speed: 274.8 images/sec total_norm: 143.6857 (145.3042) loss: 144.8091 (145.2275) masked_loss: 1.4594 (1.5247) tag_loss: 143.5081 (143.7028) time: 1.4334 (1.8632) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4281 (1.7663) save_time: 8.8805 (18.1110) lr: 0.000047 max mem: 26307
2022-03-16 21:21:13,953.953 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.40625
2022-03-16 21:21:13,953.953 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 135.30152893066406
2022-03-16 21:21:13,953.953 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.79033785516566
2022-03-16 21:21:31,900.900 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02036704309284687
2022-03-16 21:21:31,901.901 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 21:21:31,901.901 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'standing', '[MASK]', 'a', 'room', 'next', 'to', 'lots', 'of', '[MASK]', 'chairs', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 21:21:31,917.917 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tie', 'curtain', 'shirt', 'man', '[UNK]', 'chair', 'floor', 'carpet', 'light', 'shoe', 'ceiling', 'person', 'hand', 'arm', 'room', 'belt', 'hair', 'table', 'head', 'dress', 'suit', 'wall', 'leg', 'jacket', 'building', 'line', 'red', 'stage', 'sign', 'front', 'shadow', 'bow', 'reflection', 'window', 'column', 'hat', 'jean', 'woman', 'formal', 'white', 'ground', 'screen', 'ball', 'sky', 'pole', 'face', 'black', 'blue', 'large', 'letter']
2022-03-16 21:21:47,824.824 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'room', 'white', 'light', 'person', 'floor', 'arm', 'chair', 'window', 'sign', 'shirt', 'tie', 'waist', 'ceiling', 'jacket', 'carpet', 'shoe', 'curtain']
2022-03-16 21:24:11,613.613 2829:trainer.py:487 do_train_dict(): eta: 14:44:43 iter: 35200 speed: 287.6 images/sec total_norm: 141.3476 (142.8795) loss: 140.0146 (139.5841) masked_loss: 1.4799 (1.5152) tag_loss: 138.8960 (138.0689) time: 1.4325 (1.7803) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4275 (1.7752) save_time: 8.8805 (18.1110) lr: 0.000047 max mem: 26307
2022-03-16 21:24:11,976.976 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4864864945411682
2022-03-16 21:24:11,976.976 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 128.02099609375
2022-03-16 21:24:11,976.976 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.81322726717076
2022-03-16 21:24:30,056.056 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020360779017210007
2022-03-16 21:24:30,056.056 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 21:24:30,057.057 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'view', '[MASK]', 'a', 'living', 'room', '[MASK]', 'couch', '##es', 'and', 'chairs', '[MASK]', 'on', 'a', 'carpet', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 21:24:30,072.072 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['curtain', 'room', 'chair', 'wall', 'ceiling', 'picture', 'light', 'window', 'lamp', 'table', 'floor', 'plant', 'shade', 'couch', 'carpet', 'television', 'living', 'blanket', 'pillow', 'screen', 'sofa', 'pot', '[UNK]', 'flower', 'vase', 'leg', 'arm', 'painting', 'armchair', 'monitor', 'cushion', 'area', 'stand', 'large', 'shelf', 'outlet', 'top', 'computer', 'door', 'blade', 'fan', 'poster', 'ottoman', 'glass', 'laptop', 'switch', 'paper', 'book', 'furniture', 'end']
2022-03-16 21:24:46,024.024 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['room', 'light', 'living', 'television', 'floor', 'table', 'wall', 'view', 'chair', 'plant', 'foot', 'window', 'picture', 'screen', 'bird', 'ceiling', 'couch', 'monitor', 'shade', 'pot', 'pillow', 'carpet', 'lamp', 'sofa', 'curtain']
2022-03-16 21:27:09,703.703 2829:trainer.py:487 do_train_dict(): eta: 14:42:00 iter: 35300 speed: 287.5 images/sec total_norm: 142.3346 (145.9948) loss: 142.2887 (144.3121) masked_loss: 1.4734 (1.4894) tag_loss: 141.1812 (142.8227) time: 1.4339 (1.7809) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4289 (1.7756) save_time: 8.8805 (18.1110) lr: 0.000047 max mem: 26307
2022-03-16 21:27:10,066.066 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.529411792755127
2022-03-16 21:27:10,067.067 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 122.63671875
2022-03-16 21:27:10,067.067 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.81355590604792
2022-03-16 21:27:28,259.259 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02044505812227726
2022-03-16 21:27:28,259.259 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 21:27:28,260.260 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', '[MASK]', 'four', 'fruits', 'put', 'inside', 'it', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 21:27:28,275.275 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['fruit', 'carrot', 'leaf', 'orange', 'table', '[UNK]', 'stem', 'top', 'plant', 'food', 'flower', 'ground', 'shadow', 'onion', 'vegetable', 'apple', 'hole', 'bowl', 'banana', 'spot', 'reflection', 'object', 'bunch', 'bottom', 'mushroom', 'inside', 'rim', 'tomato', 'bag', 'surface', 'group', 'light', 'wood', 'berry', 'background', 'red', 'next', 'piece', 'scissors', 'close', 'writing', 'pot', 'nut', 'end', 'other', 'ball', 'handle', 'cup', 'branch', 'black']
2022-03-16 21:27:44,242.242 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['orange', 'fruit', 'flower', 'leaf', 'reflection', 'onion']
2022-03-16 21:30:07,841.841 2829:trainer.py:487 do_train_dict(): eta: 14:39:17 iter: 35400 speed: 287.4 images/sec total_norm: 143.0172 (145.7120) loss: 145.4353 (146.1105) masked_loss: 1.5128 (1.5328) tag_loss: 143.7887 (144.5778) time: 1.4334 (1.7814) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4283 (1.7763) save_time: 8.8805 (18.1110) lr: 0.000047 max mem: 26307
2022-03-16 21:30:08,203.203 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5428571701049805
2022-03-16 21:30:08,203.203 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 155.83151245117188
2022-03-16 21:30:08,203.203 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.80979587393747
2022-03-16 21:30:26,116.116 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020453158766031265
2022-03-16 21:30:26,117.117 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 21:30:26,117.117 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'kite', '##s', 'are', 'flying', 'across', 'the', '[MASK]', '[MASK]', 'winds', '##ur', '##fers', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 21:30:26,132.132 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'kite', 'person', '[UNK]', 'string', 'tail', 'grass', 'air', 'man', 'ground', 'cloud', 'beach', 'horizon', 'water', 'tree', 'shirt', 'hill', 'field', 'sand', 'ocean', 'group', 'flag', 'leg', 'pole', 'building', 'mountain', 'hair', 'background', 'arm', 'line', 'head', 'roof', 'jacket', 'wave', 'hand', 'parachute', 'couple', 'shadow', 'shore', 'fence', 'woman', 'leaf', 'short', 'light', 'shoe', 'house', 'car', 'sun', 'top', 'boat']
2022-03-16 21:30:42,015.015 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'city', 'man', 'air', 'water', 'building', 'person', 'distance', 'tree', 'beach', 'sky', 'ocean', 'wave', 'shore', 'cloud', 'horizon', 'kite', 'surfer']
2022-03-16 21:33:05,971.971 2829:trainer.py:487 do_train_dict(): eta: 14:36:34 iter: 35500 speed: 287.4 images/sec total_norm: 141.0793 (142.6804) loss: 141.5518 (143.5472) masked_loss: 1.4241 (1.5119) tag_loss: 140.0307 (142.0352) time: 1.4325 (1.7813) data: 0.0001 (0.0005) to_device: 0.0051 (0.0050) time_gpu: 1.4272 (1.7757) save_time: 8.8805 (18.1110) lr: 0.000047 max mem: 26307
2022-03-16 21:33:06,334.334 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4848484992980957
2022-03-16 21:33:06,334.334 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 154.16534423828125
2022-03-16 21:33:06,334.334 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.81255073761672
2022-03-16 21:33:24,363.363 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020433004945516586
2022-03-16 21:33:24,364.364 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 21:33:24,365.365 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'elephants', 'in', 'a', 'field', '##op', '[MASK]', 'tree', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 21:33:24,380.380 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['grass', 'head', 'field', 'bush', 'ear', 'tree', 'leg', 'nose', 'mane', 'tail', 'zebra', 'face', 'neck', '[UNK]', 'mouth', 'hill', 'eye', 'stripe', 'background', 'hair', 'spot', 'sky', 'shadow', 'ground', 'body', 'horn', 'back', 'tall', 'brush', 'rock', 'photo', 'white', 'plant', 'bird', 'grassy', 'large', 'dry', 'animal', 'water', 'couple', 'next', 'black', 'snout', 'foot', 'trunk', 'baby', 'herd', 'mountain', 'group', 'cow']
2022-03-16 21:33:40,239.239 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'field', 'tree', 'ear', 'palm', 'grass', 'tail', 'leaf', 'trunk', 'elephant']
2022-03-16 21:36:04,164.164 2829:trainer.py:487 do_train_dict(): eta: 14:33:51 iter: 35600 speed: 287.3 images/sec total_norm: 145.8818 (148.3713) loss: 144.7203 (143.7997) masked_loss: 1.5333 (1.5217) tag_loss: 143.2545 (142.2780) time: 1.4328 (1.7819) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4277 (1.7767) save_time: 8.8805 (18.1110) lr: 0.000046 max mem: 26307
2022-03-16 21:36:04,526.526 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.800000011920929
2022-03-16 21:36:04,526.526 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 150.14825439453125
2022-03-16 21:36:04,526.526 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.80533910799427
2022-03-16 21:36:22,790.790 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020455343648791313
2022-03-16 21:36:22,791.791 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 21:36:22,791.791 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', '##raf', '##fe', 'out', 'in', 'the', 'wild', 'on', '[MASK]', 'sunny', 'day', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 21:36:22,806.806 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'tree', 'grass', 'bush', 'hill', 'head', 'field', '[UNK]', 'neck', 'leg', 'ear', 'tail', 'mane', 'horn', 'face', 'ground', 'spot', 'mouth', 'nose', 'stripe', 'body', 'eye', 'zebra', 'distance', 'mountain', 'horizon', 'cloud', 'background', 'grassy', 'large', 'plain', 'dirt', 'hair', 'tall', 'dry', 'next', 'other', 'herd', 'baby', 'elephant', 'standing', 'small', 'brush', 'trunk', 'open', 'plant', 'group', 'couple', 'shadow', 'day']
2022-03-16 21:36:38,658.658 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'day', 'body', 'field', 'ground', 'hill', 'plant', 'neck', 'sky', 'wild', 'spot', 'leg', 'ear', 'grass', 'tail', 'bush', 'leaf', 'horn', 'sunny']
2022-03-16 21:39:02,531.531 2829:trainer.py:487 do_train_dict(): eta: 14:31:08 iter: 35700 speed: 287.0 images/sec total_norm: 144.3022 (146.4315) loss: 139.9591 (141.2245) masked_loss: 1.4827 (1.4689) tag_loss: 138.5107 (139.7556) time: 1.4334 (1.7837) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4283 (1.7785) save_time: 8.8805 (18.1110) lr: 0.000046 max mem: 26307
2022-03-16 21:39:02,892.892 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361
2022-03-16 21:39:02,892.892 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 117.20672607421875
2022-03-16 21:39:02,892.892 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.81341491997576
2022-03-16 21:39:20,991.991 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020464623346924782
2022-03-16 21:39:20,992.992 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 21:39:20,992.992 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'person', '[MASK]', 'two', 'plates', '[MASK]', 'pizza', 'and', 'some', 'glasses', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 21:39:21,007.007 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['glass', 'pizza', 'table', 'plate', 'wine', 'hand', 'base', 'fork', 'person', 'shrimp', 'knife', 'stem', 'ring', 'crust', 'slice', 'food', '[UNK]', 'napkin', 'finger', 'shirt', 'woman', 'handle', 'bottom', 'bottle', 'onion', 'vase', 'white', 'wall', 'chair', 'tomato', 'cup', 'dish', 'watch', 'holder', 'couple', 'bowl', 'glasses', 'water', 'top', 'red', 'cheese', 'arm', 'curtain', 'drink', 'cloth', 'necklace', 'light', 'neck', 'wrist', 'topping']
2022-03-16 21:39:36,864.864 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'hand', 'woman', 'hair', 'person', 'table', 'wall', 'food', 'glass', 'ring', 'bottom', 'drink', 'wine', 'plate', 'knife', 'blade', 'fork', 'dish', 'pizza', 'pepper', 'crust']
2022-03-16 21:42:00,807.807 2829:trainer.py:487 do_train_dict(): eta: 14:28:24 iter: 35800 speed: 287.2 images/sec total_norm: 142.2354 (146.7149) loss: 145.7817 (146.2305) masked_loss: 1.4705 (1.4970) tag_loss: 144.4592 (144.7335) time: 1.4328 (1.7828) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4276 (1.7776) save_time: 8.8805 (18.1110) lr: 0.000046 max mem: 26307
2022-03-16 21:42:01,167.167 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6666666865348816
2022-03-16 21:42:01,168.168 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 129.02682495117188
2022-03-16 21:42:01,168.168 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.82260449550277
2022-03-16 21:42:19,501.501 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02045128494501114
2022-03-16 21:42:19,501.501 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 21:42:19,502.502 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'black', '[MASK]', '[MASK]', 'on', 'top', 'of', 'a', 'microwave', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 21:42:19,517.517 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['ear', 'cat', 'eye', '[UNK]', 'wall', 'microwave', 'door', 'head', 'handle', 'nose', 'window', 'face', 'panel', 'top', 'pen', 'kitchen', 'shelf', 'cup', 'cabinet', 'container', 'logo', 'glass', 'light', 'refrigerator', 'oven', 'label', 'button', 'basket', 'bag', 'black', 'clock', 'display', 'control', 'box', 'knob', 'pencil', 'cord', 'rack', 'book', 'curtain', 'paper', 'counter', 'board', 'metal', 'reflection', 'screen', 'outlet', 'pot', 'paw', 'drawer']
2022-03-16 21:42:35,437.437 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'face', 'black', 'top', 'door', 'light', 'cup', 'control', 'wall', 'eye', 'window', 'box', 'kitchen', 'nose', 'bag', 'ear', 'bowl', 'display', 'cat', 'handle', 'clock', 'cabinet', 'knife', 'panel', 'button', 'pen', 'container', 'pencil', 'microwave']
03-16 21:42:59.245 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi
03-16 21:42:59.245 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi
03-16 21:43:00.587 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}]
2022-03-16 21:44:59,158.158 2829:trainer.py:487 do_train_dict(): eta: 14:25:41 iter: 35900 speed: 287.1 images/sec total_norm: 144.4305 (145.1420) loss: 142.0477 (142.3598) masked_loss: 1.5310 (1.5445) tag_loss: 140.3518 (140.8152) time: 1.4334 (1.7835) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4279 (1.7783) save_time: 8.8805 (18.1110) lr: 0.000046 max mem: 26307
2022-03-16 21:44:59,519.519 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663
2022-03-16 21:44:59,519.519 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 163.127197265625
2022-03-16 21:44:59,519.519 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.82308253182305
2022-03-16 21:45:18,026.026 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020482752472162247
2022-03-16 21:45:18,027.027 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 21:45:18,027.027 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'elephants', 'wadi', '##ng', 'through', 'a', 'lake', 'next', 'to', '[MASK]', 'jungle', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 21:45:18,042.042 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['water', 'elephant', 'ear', 'head', 'eye', 'tree', 'leaf', 'back', 'trunk', 'river', 'body', 'mouth', 'face', 'rock', 'bank', 'tail', 'grass', 'branch', 'shore', '[UNK]', 'bush', 'plant', 'large', 'ripple', 'hair', 'leg', 'splash', 'couple', 'top', 'mud', 'waterfall', 'standing', 'shirt', 'nose', 'stick', 'baby', 'gray', 'next', 'shallow', 'arm', 'big', 'grey', 'man', 'reflection', 'drinking', 'young', 'skin', 'group', 'other', 'muddy']
2022-03-16 21:45:34,081.081 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['back', 'head', 'next', 'water', 'river', 'lake', 'eye', 'ear', 'grass', 'leaf', 'trunk', 'jungle', 'elephant']
2022-03-16 21:47:57,667.667 2829:trainer.py:487 do_train_dict(): eta: 14:22:58 iter: 36000 speed: 286.8 images/sec total_norm: 142.9994 (146.3162) loss: 142.5234 (143.5166) masked_loss: 1.5696 (1.5677) tag_loss: 141.0210 (141.9489) time: 1.4322 (1.7851) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4269 (1.7799) save_time: 8.8805 (18.1110) lr: 0.000046 max mem: 26307
2022-03-16 21:47:58,027.027 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6285714507102966
2022-03-16 21:47:58,027.027 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 139.08287048339844
2022-03-16 21:47:58,027.027 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.81849785590767
2022-03-16 21:48:16,291.291 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02050480805337429
2022-03-16 21:48:16,292.292 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 21:48:16,292.292 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'long', 'paved', 'park', 'path', 'lined', 'with', 'benches', 'that', 'are', 'filled', 'with', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 21:48:16,307.307 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['person', 'bench', 'tree', 'ground', 'shadow', 'park', 'man', 'building', 'snow', 'chair', 'trunk', 'sky', 'hat', 'group', 'head', '[UNK]', 'umbrella', 'shirt', 'shoe', 'pole', 'street', 'road', 'sidewalk', 'photo', 'empty', 'woman', 'white', 'jacket', 'old', 'couple', 'mountain', 'leg', 'light', 'branch', 'car', 'bag', 'many', 'bird', 'lamp', 'background', 'coat', 'wall', 'area', 'bush', 'curb', 'bunch', 'pigeon', 'large', 'roof', 'wooden']
2022-03-16 21:48:32,150.150 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['long', 'man', 'building', 'park', 'ground', 'person', 'tree', 'path', 'shadow', 'bench', 'trunk', 'sidewalk', 'paved', 'curb']
2022-03-16 21:50:55,932.932 2829:trainer.py:487 do_train_dict(): eta: 14:20:15 iter: 36100 speed: 287.2 images/sec total_norm: 144.2137 (145.6423) loss: 146.4961 (147.4221) masked_loss: 1.5244 (1.5625) tag_loss: 144.9690 (145.8596) time: 1.4319 (1.7826) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4265 (1.7774) save_time: 8.8805 (18.1110) lr: 0.000046 max mem: 26307
2022-03-16 21:50:56,293.293 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.71875
2022-03-16 21:50:56,294.294 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 163.20558166503906
2022-03-16 21:50:56,294.294 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.81455545794239
2022-03-16 21:51:14,578.578 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02050723508000374
2022-03-16 21:51:14,578.578 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 21:51:14,578.578 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'women', 'with', 'very', 'small', 'swim', '##suit', '##s', 'pose', 'and', 'talk', 'on', 'the', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 21:51:14,594.594 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hand', 'woman', 'shirt', 'hair', 'head', 'girl', '[UNK]', 'face', 'chain', 'arm', 'nose', 'man', 'person', 'mouth', 'eye', 'bracelet', 'necklace', 'sunglasses', 'ear', 'dress', 'phone', 'top', 'short', 'skirt', 'ring', 'glasses', 'wall', 'sleeve', 'leg', 'hat', 'stripe', 'finger', 'cell', 'tie', 'suit', 'handle', 'boy', 'purse', 'jean', 'jacket', 'shoe', 'belt', 'watch', 'young', 'child', 'pole', 'bag', 'elbow', 'grass', 'strap']
2022-03-16 21:51:30,609.609 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'hand', 'face', 'small', 'band', 'book', 'woman', 'hair', 'girl', 'mouth', 'person', 'floor', 'phone', 'eye', 'cell', 'leg', 'dress', 'nose', 'lip', 'collar', 'costume', 'boot', 'bikini', 'bandage', 'sock']
2022-03-16 21:53:54,338.338 2829:trainer.py:487 do_train_dict(): eta: 14:17:31 iter: 36200 speed: 287.0 images/sec total_norm: 141.8922 (145.1296) loss: 142.7069 (143.6270) masked_loss: 1.5291 (1.5233) tag_loss: 141.0683 (142.1037) time: 1.4321 (1.7841) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4269 (1.7789) save_time: 8.8805 (18.1110) lr: 0.000045 max mem: 26307
2022-03-16 21:53:54,702.702 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6176470518112183
2022-03-16 21:53:54,702.702 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 151.873779296875
2022-03-16 21:53:54,702.702 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.80913908356806
2022-03-16 21:54:13,164.164 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020511014387011528
2022-03-16 21:54:13,164.164 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 21:54:13,165.165 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'there', 'is', 'a', 'yellow', 'fire', 'hydra', '##nt', 'in', 'the', 'middle', 'of', '[MASK]', 'field', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 21:54:13,180.180 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'grass', 'rock', 'sky', 'ground', 'trunk', 'branch', 'park', 'boulder', '[UNK]', 'field', 'bush', 'wood', 'stone', 'bench', 'dirt', 'moss', 'sign', 'log', 'stump', 'leg', 'top', 'green', 'head', 'next', 'grassy', 'structure', 'front', 'post', 'fire', 'slab', 'ear', 'pole', 'forest', 'area', 'lush', 'flower', 'shadow', 'eye', 'letter', 'red', 'face', 'stick', 'middle', 'leaf', 'hill', 'tail', 'white', 'hand', 'roof']
2022-03-16 21:54:29,041.041 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'top', 'park', 'field', 'fire', 'ground', 'rock', 'middle', 'tree', 'branch', 'sky', 'yellow', 'chain', 'grass', 'cap', 'pole', 'trunk', 'log', 'boulder']
2022-03-16 21:56:52,809.809 2829:trainer.py:487 do_train_dict(): eta: 14:14:48 iter: 36300 speed: 286.9 images/sec total_norm: 145.9454 (148.2863) loss: 139.2859 (141.0663) masked_loss: 1.5221 (1.5420) tag_loss: 137.9266 (139.5243) time: 1.4317 (1.7847) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4266 (1.7795) save_time: 8.8805 (18.1110) lr: 0.000045 max mem: 26307
2022-03-16 21:56:53,170.170 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.53125
2022-03-16 21:56:53,170.170 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 142.19607543945312
2022-03-16 21:56:53,170.170 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.82295267922538
2022-03-16 21:57:11,659.659 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020537735894322395
2022-03-16 21:57:11,659.659 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 21:57:11,660.660 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'in', 'a', 'suit', 'and', '[MASK]', 'wearing', 'a', 'hat', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 21:57:11,675.675 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['window', 'building', 'tie', 'man', 'wall', 'car', 'button', 'road', 'hat', 'head', 'street', 'jacket', 'hand', 'line', 'sunglasses', 'snow', 'beard', 'face', 'roof', 'mouth', 'coat', 'nose', 'suit', 'sky', 'sidewalk', 'chimney', 'house', 'light', 'pocket', 'shirt', 'sign', '[UNK]', 'pole', 'glasses', 'fence', 'cap', 'ear', 'tire', 'arm', 'ring', 'truck', 'shutter', 'finger', 'city', 'van', 'tree', 'vehicle', 'wire', 'suv', 'parking']
2022-03-16 21:57:27,539.539 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'house', 'hand', 'face', 'line', 'building', 'road', 'street', 'light', 'car', 'mouth', 'wall', 'van', 'window', 'sky', 'roof', 'nose', 'ear', 'snow', 'suit', 'coat', 'tie', 'tail', 'hat', 'button', 'jacket', 'glasses', 'beard', 'sunglasses', 'chimney', 'windshield']
2022-03-16 21:59:51,487.487 2829:trainer.py:487 do_train_dict(): eta: 14:12:04 iter: 36400 speed: 286.6 images/sec total_norm: 144.1550 (145.9851) loss: 143.1841 (144.8107) masked_loss: 1.5844 (1.5969) tag_loss: 141.5941 (143.2138) time: 1.4326 (1.7868) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4273 (1.7815) save_time: 8.8805 (18.1110) lr: 0.000045 max mem: 26307
2022-03-16 21:59:51,846.846 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5277777910232544
2022-03-16 21:59:51,847.847 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 150.60223388671875
2022-03-16 21:59:51,847.847 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.81565057414852
2022-03-16 22:00:10,265.265 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020538603886961937
2022-03-16 22:00:10,266.266 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 22:00:10,266.266 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'clear', '[MASK]', 'holds', 'blue', 'and', 'white', 'flowers', 'with', '[MASK]', '##ri', '##gs', 'of', 'greene', '##ry', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 22:00:10,282.282 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['flower', 'wall', 'vase', 'brick', 'frame', 'leaf', 'water', 'window', '[UNK]', 'bouquet', 'base', 'glass', 'design', 'stem', 'table', 'ledge', 'star', 'wood', 'blue', 'white', 'bottom', 'building', 'mirror', 'ground', 'purple', 'decoration', 'board', 'front', 'fence', 'picture', 'knot', 'clear', 'reflection', 'handle', 'blind', 'light', 'chair', 'ball', 'ribbon', 'shadow', 'floor', 'top', 'mat', 'bottle', 'full', 'line', 'arrangement', 'door', 'post', 'colorful']
2022-03-16 22:00:26,143.143 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'water', 'white', 'blue', 'star', 'wall', 'clear', 'glass', 'frame', 'handle', 'brick', 'flower', 'leaf', 'decoration', 'knot', 'ledge', 'vase']
2022-03-16 22:02:50,001.001 2829:trainer.py:487 do_train_dict(): eta: 14:09:21 iter: 36500 speed: 286.8 images/sec total_norm: 144.3425 (145.8945) loss: 145.7887 (145.9358) masked_loss: 1.5164 (1.5344) tag_loss: 144.2627 (144.4014) time: 1.4322 (1.7851) data: 0.0001 (0.0001) to_device: 0.0052 (0.0051) time_gpu: 1.4271 (1.7799) save_time: 8.8805 (18.1110) lr: 0.000045 max mem: 26307
2022-03-16 22:02:50,362.362 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6666666865348816
2022-03-16 22:02:50,362.362 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 129.4102783203125
2022-03-16 22:02:50,362.362 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.83403931289423
2022-03-16 22:03:09,089.089 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02055048756301403
2022-03-16 22:03:09,089.089 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 22:03:09,089.089 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'there', 'is', 'a', 'cat', 'sitting', '[MASK]', 'a', 'laptop', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 22:03:09,105.105 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['cat', 'ear', 'keyboard', 'desk', 'book', 'table', 'eye', 'computer', 'door', 'head', 'laptop', 'paper', '[UNK]', 'key', 'mouse', 'screen', 'wall', 'cabinet', 'cord', 'bowl', 'pen', 'bag', 'logo', 'monitor', 'button', 'nose', 'box', 'wire', 'top', 'shelf', 'pad', 'container', 'speaker', 'cd', 'face', 'paw', 'tail', 'front', 'cable', 'pencil', 'lid', 'magazine', 'light', 'picture', 'kitten', 'collar', 'stand', 'cup', 'pot', 'remote']
2022-03-16 22:03:25,053.053 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'book', 'door', 'star', 'table', 'wall', 'eye', 'paper', 'computer', 'screen', 'nose', 'bag', 'ear', 'desk', 'cat', 'cabinet', 'cable', 'mouse', 'monitor', 'keyboard', 'cord', 'container', 'pad', 'laptop', 'icon', 'mat', 'notebook']
2022-03-16 22:05:48,753.753 2829:trainer.py:487 do_train_dict(): eta: 14:06:37 iter: 36600 speed: 286.4 images/sec total_norm: 143.2368 (144.7022) loss: 145.1027 (144.2995) masked_loss: 1.4793 (1.5321) tag_loss: 143.4799 (142.7674) time: 1.4329 (1.7875) data: 0.0001 (0.0005) to_device: 0.0051 (0.0050) time_gpu: 1.4277 (1.7820) save_time: 8.8805 (18.1110) lr: 0.000045 max mem: 26307
2022-03-16 22:05:49,113.113 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6285714507102966
2022-03-16 22:05:49,114.114 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 141.56411743164062
2022-03-16 22:05:49,114.114 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.83486683816936
2022-03-16 22:06:07,798.798 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0206337608397007
2022-03-16 22:06:07,798.798 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 22:06:07,798.798 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'boy', 'watches', '[MASK]', '[MASK]', 'bear', 'chew', 'on', 'a', 'bone', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 22:06:07,814.814 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hand', 'bear', 'shirt', 'eye', 'head', 'ear', 'nose', 'water', 'hair', 'finger', 'boy', 'arm', 'polar', 'person', 'paw', '[UNK]', 'face', 'thumb', 'toy', 'wall', 'fish', 'leg', 'blue', 'animal', 'sleeve', 'mouth', 'rock', 'woman', 'young', 'food', 'ledge', 'child', 'nail', 'bracelet', 'wrist', 'claw', 'ice', 'handle', 'ball', 'bone', 'tree', 'man', 'glasses', 'pool', 'logo', 'little', 'large', 'small', 'snout', 'neck']
2022-03-16 22:06:23,818.818 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'hand', 'face', 'water', 'hair', 'person', 'arm', 'boy', 'eye', 'shirt', 'animal', 'finger', 'nose', 'ear', 'bear', 'bone', 'thumb', 'toy', 'polar', 'chew']
2022-03-16 22:08:47,439.439 2829:trainer.py:487 do_train_dict(): eta: 14:03:53 iter: 36700 speed: 286.5 images/sec total_norm: 142.4067 (144.1378) loss: 145.0082 (146.6317) masked_loss: 1.5218 (1.5158) tag_loss: 143.8188 (145.1158) time: 1.4328 (1.7869) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4276 (1.7817) save_time: 8.8805 (18.1110) lr: 0.000045 max mem: 26307
2022-03-16 22:08:47,800.800 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.48571428656578064
2022-03-16 22:08:47,800.800 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 132.3931884765625
2022-03-16 22:08:47,800.800 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.84291374165079
2022-03-16 22:09:06,399.399 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020683057606220245
2022-03-16 22:09:06,399.399 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 22:09:06,400.400 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'dog', 'with', 'a', 'fr', '##is', '##bee', '[MASK]', 'the', 'snow', '##fi', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 22:09:06,415.415 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['snow', 'dog', 'ground', 'tree', 'tail', 'ear', '[UNK]', 'leg', 'head', 'nose', 'eye', 'face', 'shadow', 'mouth', 'paw', 'tag', 'collar', 'glove', 'back', 'jacket', 'hat', 'ski', 'mane', 'foot', 'person', 'tongue', 'fur', 'coat', 'track', 'fence', 'hair', 'snowy', 'line', 'bush', 'spot', 'brown', 'shoe', 'background', 'boot', 'neck', 'arm', 'pole', 'skier', 'hand', 'body', 'teeth', 'branch', 'sky', 'man', 'trunk']
2022-03-16 22:09:22,318.318 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'face', 'ground', 'eye', 'tree', 'dog', 'nose', 'ear', 'snow', 'tail', 'tag', 'fence', 'paw']
2022-03-16 22:11:46,075.075 2829:trainer.py:487 do_train_dict(): eta: 14:01:10 iter: 36800 speed: 286.6 images/sec total_norm: 142.6517 (145.1114) loss: 144.3600 (145.3605) masked_loss: 1.4691 (1.5138) tag_loss: 142.9871 (143.8468) time: 1.4329 (1.7864) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4275 (1.7812) save_time: 8.8805 (18.1110) lr: 0.000045 max mem: 26307
2022-03-16 22:11:46,436.436 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5277777910232544
2022-03-16 22:11:46,436.436 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 157.39910888671875
2022-03-16 22:11:46,436.436 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.83789838878766
2022-03-16 22:12:05,043.043 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02067859284579754
2022-03-16 22:12:05,043.043 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 22:12:05,043.043 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'black', 'and', 'gray', 'keyboard', 'and', 'black', 'mouse', 'on', 'a', '[MASK]', '[MASK]', 'carpet', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 22:12:05,059.059 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['mouse', 'keyboard', 'cord', 'floor', 'ground', '[UNK]', 'button', 'computer', 'wire', 'carpet', 'key', 'pad', 'ipod', 'table', 'remote', 'strap', 'logo', 'phone', 'laptop', 'cell', 'black', 'screen', 'paper', 'controller', 'next', 'camera', 'desk', 'surface', 'plug', 'electronic', 'speaker', 'pen', 'box', 'book', 'control', 'monitor', 'light', 'white', 'reflection', 'small', 'leg', 'antenna', 'case', 'circle', 'game', 'handle', 'cable', 'other', 'tag', 'writing']
2022-03-16 22:12:20,884.884 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'black', 'ground', 'floor', 'surface', 'gray', 'button', 'wire', 'mouse', 'keyboard', 'carpet', 'decorative', 'cord', 'beige']
03-16 22:13:00.685 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi
03-16 22:13:00.685 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi
03-16 22:13:02.196 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}]
2022-03-16 22:14:45,044.044 2829:trainer.py:487 do_train_dict(): eta: 13:58:26 iter: 36900 speed: 286.1 images/sec total_norm: 143.6793 (145.3927) loss: 142.1302 (143.9641) masked_loss: 1.5465 (1.5477) tag_loss: 140.5966 (142.4164) time: 1.4324 (1.7896) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4270 (1.7845) save_time: 8.8805 (18.1110) lr: 0.000044 max mem: 26307
2022-03-16 22:14:45,405.405 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663
2022-03-16 22:14:45,405.405 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 141.97787475585938
2022-03-16 22:14:45,406.406 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.85165461978397
2022-03-16 22:15:04,161.161 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020698657259345055
2022-03-16 22:15:04,161.161 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 22:15:04,162.162 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'large', 'tall', '[MASK]', '[MASK]', 'on', 'a', 'road', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 22:15:04,177.177 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'clock', 'floor', 'picture', 'grandfather', 'frame', 'base', 'post', 'carpet', 'wood', 'wooden', 'handle', 'door', 'leg', 'stand', 'hand', 'sword', 'panel', 'pole', 'face', 'shadow', 'table', 'plate', 'rug', '[UNK]', 'gun', 'front', 'cabinet', 'room', 'old', 'top', 'book', 'number', 'painting', 'chair', 'rope', 'furniture', 'light', 'holder', 'sign', 'flower', 'mat', 'foot', 'mantle', 'paper', 'next', 'window', 'large', 'antique', 'woman']
2022-03-16 22:15:20,087.087 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['large', 'door', 'road', 'post', 'floor', 'wall', 'base', 'stand', 'gun', 'metal', 'picture', 'tall', 'frame', 'handle', 'clock', 'shadow', 'panel', 'carpet']
2022-03-16 22:17:44,077.077 2829:trainer.py:487 do_train_dict(): eta: 13:55:42 iter: 37000 speed: 286.0 images/sec total_norm: 144.2526 (146.3389) loss: 143.5280 (144.3948) masked_loss: 1.4919 (1.5015) tag_loss: 141.9779 (142.8932) time: 1.4323 (1.7904) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4270 (1.7852) save_time: 8.8805 (18.1110) lr: 0.000044 max mem: 26307
2022-03-16 22:17:44,437.437 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6060606241226196
2022-03-16 22:17:44,438.438 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 126.27552032470703
2022-03-16 22:17:44,438.438 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.8635114376757
2022-03-16 22:18:03,171.171 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020698022097349167
2022-03-16 22:18:03,171.171 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 22:18:03,172.172 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'four', '[MASK]', 'standing', '[MASK]', 'a', 'sidewalk', 'in', 'a', 'city', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 22:18:03,187.187 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['man', '[UNK]', 'jacket', 'pole', 'building', 'person', 'shoe', 'sidewalk', 'bag', 'woman', 'hair', 'wheel', 'street', 'ground', 'sign', 'purse', 'suit', 'window', 'bench', 'post', 'light', 'coat', 'jean', 'shirt', 'leg', 'sky', 'city', 'tree', 'cover', 'bike', 'bicycle', 'head', 'railing', 'wall', 'fence', 'lamp', 'hand', 'hat', 'clock', 'store', 'board', 'base', 'roof', 'boot', 'flag', 'face', 'car', 'road', 'pipe', 'backpack']
2022-03-16 22:18:19,046.046 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'city', 'head', 'man', 'building', 'street', 'woman', 'ground', 'board', 'hair', 'person', 'wall', 'cover', 'window', 'box', 'store', 'sign', 'jean', 'shirt', 'leg', 'bag', 'suit', 'wheel', 'coat', 'pole', 'jacket', 'bike', 'purse', 'shoe', 'sidewalk', 'railing', 'sunglasses', 'graffiti']
2022-03-16 22:20:43,231.231 2829:trainer.py:487 do_train_dict(): eta: 13:52:59 iter: 37100 speed: 285.8 images/sec total_norm: 142.5453 (146.6679) loss: 142.8995 (144.3784) masked_loss: 1.4497 (1.5295) tag_loss: 141.3751 (142.8489) time: 1.4335 (1.7915) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4286 (1.7864) save_time: 8.8805 (18.1110) lr: 0.000044 max mem: 26307
2022-03-16 22:20:43,590.590 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7352941036224365
2022-03-16 22:20:43,590.590 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 152.84902954101562
2022-03-16 22:20:43,591.591 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.86224854377008 2022-03-16 22:21:02,557.557 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0207029040902853 2022-03-16 22:21:02,557.557 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 22:21:02,557.557 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'group', 'of', 'sheer', '##ed', 'sheep', 'hu', '[MASK]', 'in', 'a', 'group', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 22:21:02,573.573 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'sheep', 'fence', 'ground', 'dirt', 'head', 'gate', 'field', 'ear', 'face', 'herd', 'post', 'grass', 'leg', '[UNK]', 'leaf', 'trunk', 'pen', 'group', 'pole', 'area', 'nose', 'tail', 'animal', 'branch', 'farm', 'dog', 'net', 'stand', 'flock', 'cow', 'enclosure', 'bush', 'bunch', 'black', 'plant', 'hay', 'other', 'rock', 'mud', 'metal', 'tag', 'flower', 'horn', 'lamb', 'horse', 'next', 'patch', 'shirt', 'grazing'] 2022-03-16 22:21:18,601.601 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'group', 'face', 'field', 'ground', 'post', 'tree', 'leg', 'ear', 'gate', 'pole', 'dirt', 'sheep', 'fence', 'hay', 'herd'] 2022-03-16 22:23:42,118.118 2829:trainer.py:487 do_train_dict(): eta: 13:50:15 iter: 37200 speed: 286.2 images/sec total_norm: 143.1843 (146.9494) loss: 143.5977 (144.4710) masked_loss: 1.4802 (1.5095) tag_loss: 141.9933 (142.9615) time: 1.4325 (1.7888) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4273 (1.7837) save_time: 8.8805 (18.1110) lr: 0.000044 max mem: 26307 2022-03-16 22:23:42,479.479 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6060606241226196 2022-03-16 22:23:42,479.479 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 156.52777099609375 2022-03-16 22:23:42,479.479 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
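[annotation] The timing fields in the do_train_dict() record above decompose cleanly: time is data + to_device + time_gpu to within rounding (0.0001 + 0.0051 + 1.4273 = 1.4325; 0.0002 + 0.0050 + 1.7837 ~ 1.7888 for the global averages). Throughput and ETA then follow from the global-average iteration time: 512 images / 1.7888 s ~ 286.2 images/sec matches the logged speed, pointing at a global batch of 512 (e.g. 8 GPUs x 64), and the logged "eta: 13:50:15" at iter 37200 fits a max-iteration count near 65,000. A sketch of that arithmetic; the batch size and max_iter are assumptions inferred from the numbers, not values stated in the log:

    import datetime

    def throughput(global_batch_size, avg_iter_time_s):
        # images/sec from the global-average seconds per iteration.
        return global_batch_size / avg_iter_time_s

    def training_eta(avg_iter_time_s, cur_iter, max_iter):
        # "eta: HH:MM:SS" from the remaining iterations.
        remaining_s = (max_iter - cur_iter) * avg_iter_time_s
        return str(datetime.timedelta(seconds=int(remaining_s)))

    # Assumed global batch of 512: 512 / 1.7888 ~= 286.2, matching "speed: 286.2".
    print(throughput(512, 1.7888))
    # Assumed max_iter of 65000: prints "13:48:48", close to the logged "13:50:15".
    print(training_eta(1.7888, 37200, 65000))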
= 70.85706361049621 2022-03-16 22:24:01,491.491 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020694421604275703 2022-03-16 22:24:01,491.491 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 22:24:01,492.492 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'man', 'dive', '[MASK]', 'to', 'catch', 'a', 'fr', '##is', '##be', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 22:24:01,507.507 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'grass', 'shirt', 'man', 'hair', 'head', 'hand', '[UNK]', 'arm', 'ring', 'beach', 'person', 'boy', 'leg', 'ground', 'sand', 'short', 'face', 'foot', 'water', 'circle', 'woman', 'green', 'child', 'logo', 'kite', 'field', 'board', 'disc', 'ear', 'pole', 'post', 'sleeve', 'mouth', 'air', 'bush', 'watch', 'jean', 'blue', 'dirt', 'design', 'ocean', 'shoe', 'hill', 'wrist', 'couple', 'shore', 'young', 'girl', 'top'] 2022-03-16 22:24:17,388.388 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'water', 'woman', 'ground', 'hair', 'person', 'arm', 'foot', 'beach', 'ring', 'sky', 'shirt', 'leg', 'sand', 'grass'] 2022-03-16 22:26:41,339.339 2829:trainer.py:487 do_train_dict(): eta: 13:47:31 iter: 37300 speed: 285.7 images/sec total_norm: 142.1692 (145.5851) loss: 147.5261 (147.6702) masked_loss: 1.5053 (1.5126) tag_loss: 145.8689 (146.1576) time: 1.4332 (1.7922) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4281 (1.7870) save_time: 8.8805 (18.1110) lr: 0.000044 max mem: 26307 2022-03-16 22:26:41,700.700 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361 2022-03-16 22:26:41,701.701 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 129.5003204345703 2022-03-16 22:26:41,701.701 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
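[annotation] The recurring "# of tokens = 577" is exactly 24*24 + 1, i.e. what a vision transformer produces for a 384x384 image with 16x16 patches plus one [CLS] token. The encoder config is not in this log, so treat the check below as a plausible consistency argument rather than a confirmed setting:

    def vit_token_count(image_size=384, patch_size=16):
        # (image_size / patch_size)^2 patch tokens + 1 [CLS] token.
        return (image_size // patch_size) ** 2 + 1

    assert vit_token_count() == 577  # matches "# of tokens = 577"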
= 70.8602417012587 2022-03-16 22:27:00,623.623 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020728370174765587 2022-03-16 22:27:00,623.623 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 22:27:00,624.624 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', 'suitcase', '##s', 'are', '[MASK]', 'to', 'be', 'picked', 'up', 'at', 'the', 'counter', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 22:27:00,639.639 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['suitcase', 'luggage', 'floor', 'carpet', 'bag', 'tag', 'handle', 'airport', '[UNK]', 'backpack', 'jacket', 'sign', 'wheel', 'ceiling', 'light', 'shirt', 'person', 'man', 'paper', 'column', 'pillar', 'strap', 'ground', 'cart', 'wall', 'case', 'woman', 'zipper', 'poster', 'board', 'purse', 'railing', 'hair', 'claim', 'blue', 'pole', 'coat', 'building', 'display', 'logo', 'jean', 'door', 'belt', 'baggage', 'lobby', 'hand', 'bench', 'window', 'box', 'wheelchair'] 2022-03-16 22:27:16,537.537 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'board', 'floor', 'wall', 'airport', 'sign', 'bag', 'counter', 'handle', 'wheel', 'ceiling', 'column', 'tag', 'jacket', 'carpet', 'poster', 'pillar', 'suitcase', 'luggage', 'zipper'] 2022-03-16 22:29:40,502.502 2829:trainer.py:487 do_train_dict(): eta: 13:44:47 iter: 37400 speed: 285.8 images/sec total_norm: 142.8483 (144.7200) loss: 142.6678 (143.3559) masked_loss: 1.5306 (1.5459) tag_loss: 141.3286 (141.8100) time: 1.4332 (1.7917) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4280 (1.7865) save_time: 8.8805 (18.1110) lr: 0.000044 max mem: 26307 2022-03-16 22:29:40,862.862 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663 2022-03-16 22:29:40,862.862 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 153.02297973632812 2022-03-16 22:29:40,862.862 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.85464694213867 2022-03-16 22:29:59,918.918 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020721744745969772 2022-03-16 22:29:59,918.918 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 22:29:59,919.919 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'beautiful', 'green', 'vase', 'is', 'on', 'display', '[MASK]', 'a', 'cabinet', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 22:29:59,934.934 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'vase', 'shadow', 'table', 'flower', 'base', 'design', 'top', 'handle', 'shelf', 'picture', 'display', 'floor', 'door', 'white', 'rim', 'neck', 'stand', 'blue', 'mirror', 'green', 'bowl', 'lid', '[UNK]', 'cloth', 'jar', 'frame', 'glass', 'reflection', 'leaf', 'light', 'pot', 'decorative', 'outlet', 'bottom', 'paper', 'small', 'sign', 'room', 'dot', 'container', 'stem', 'next', 'front', 'box', 'window', 'different', 'black', 'large', 'side'] 2022-03-16 22:30:15,775.775 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['door', 'green', 'floor', 'table', 'wall', 'base', 'paper', 'beautiful', 'display', 'shadow', 'cabinet', 'mirror', 'flower', 'rim', 'vase'] 2022-03-16 22:32:39,814.814 2829:trainer.py:487 do_train_dict(): eta: 13:42:04 iter: 37500 speed: 285.5 images/sec total_norm: 145.4215 (150.6948) loss: 145.9200 (145.5019) masked_loss: 1.5618 (1.5809) tag_loss: 144.4152 (143.9209) time: 1.4334 (1.7931) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4281 (1.7879) save_time: 8.8805 (18.1110) lr: 0.000044 max mem: 26307 2022-03-16 22:32:40,180.180 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7575757503509521 2022-03-16 22:32:40,180.180 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 123.19025421142578 2022-03-16 22:32:40,180.180 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
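[annotation] The "caption acc" values are ratios with small denominators (0.7575757503509521 = 25/33 in the record above; 0.59375 = 19/32 appears later), which fits an accuracy computed only over the [MASK] positions of the logged batch's captions rather than over all tokens. A PyTorch sketch of that computation with invented tensor names; the pipeline's own code at tagger_caption_uni_pipeline_expanding.py:404 is not shown in the log:

    import torch

    def masked_caption_accuracy(logits, target_ids, input_ids, mask_token_id):
        # Accuracy over positions where the input was [MASK]; visible words
        # and [PAD] positions are ignored.
        masked = input_ids == mask_token_id
        preds = logits.argmax(dim=-1)
        correct = (preds[masked] == target_ids[masked]).float()
        return correct.mean().item()

    # Toy example: 4 positions, 2 masked, 1 predicted correctly -> 0.5
    vocab = 10
    logits = torch.zeros(1, 4, vocab)
    logits[0, 1, 3] = 1.0   # prediction at the first masked slot
    logits[0, 2, 7] = 1.0   # prediction at the second masked slot
    input_ids = torch.tensor([[5, 0, 0, 6]])   # 0 = assumed [MASK] id
    target_ids = torch.tensor([[5, 3, 4, 6]])  # ground-truth caption ids
    print(masked_caption_accuracy(logits, target_ids, input_ids, mask_token_id=0))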
= 70.85614838498704 2022-03-16 22:32:59,371.371 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02071467787027359 2022-03-16 22:32:59,371.371 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 22:32:59,372.372 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'pizza', '##s', 'in', 'delivery', 'boxes', 'are', 'on', 'the', 'table', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 22:32:59,387.387 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['pizza', 'table', 'box', 'bag', '[UNK]', 'lid', 'reflection', 'cheese', 'topping', 'bottle', 'crust', 'slice', 'glass', 'person', 'pepper', 'floor', 'hand', 'tomato', 'food', 'mushroom', 'paper', 'label', 'top', 'bowl', 'light', 'napkin', 'chair', 'counter', 'plate', 'water', 'wall', 'shrimp', 'cardboard', 'handle', 'cup', 'onion', 'next', 'different', 'container', 'meat', 'spoon', 'fork', 'shirt', 'sausage', 'phone', 'large', 'shadow', 'spot', 'soda', 'wooden'] 2022-03-16 22:33:15,364.364 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'table', 'box', 'bag', 'object', 'delivery', 'cheese', 'reflection', 'pizza', 'pepper', 'lid', 'mushroom', 'crust', 'tomato', 'topping', 'pea'] 2022-03-16 22:35:39,107.107 2829:trainer.py:487 do_train_dict(): eta: 13:39:20 iter: 37600 speed: 285.6 images/sec total_norm: 142.8860 (146.0319) loss: 142.9943 (144.7085) masked_loss: 1.5040 (1.5274) tag_loss: 141.8533 (143.1811) time: 1.4324 (1.7929) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4272 (1.7877) save_time: 8.8805 (18.1110) lr: 0.000043 max mem: 26307 2022-03-16 22:35:39,468.468 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6764705777168274 2022-03-16 22:35:39,468.468 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 164.0895538330078 2022-03-16 22:35:39,468.468 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.85329472670821 2022-03-16 22:35:58,510.510 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02071584016084671 2022-03-16 22:35:58,510.510 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 22:35:58,510.510 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'flock', 'of', 'sheer', '[MASK]', 'sheep', 'in', 'a', 'field', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 22:35:58,526.526 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['grass', 'sheep', 'leg', 'head', 'field', 'ear', 'face', 'tail', 'fence', 'nose', 'bush', 'lamb', '[UNK]', 'grassy', 'spot', 'green', 'eye', 'mouth', 'wool', 'post', 'pasture', 'white', 'herd', 'animal', 'tree', 'lush', 'group', 'plant', 'bird', 'grazing', 'standing', 'meadow', 'couple', 'background', 'open', 'top', 'foot', 'black', 'weed', 'other', 'goat', 'road', 'neck', 'dog', 'large', 'pole', 'small', 'next', 'walking', 'horn'] 2022-03-16 22:36:14,433.433 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'back', 'head', 'face', 'field', 'leg', 'ear', 'grass', 'sheep', 'fence', 'lamb', 'flock'] 2022-03-16 22:38:38,502.502 2829:trainer.py:487 do_train_dict(): eta: 13:36:36 iter: 37700 speed: 285.4 images/sec total_norm: 141.9469 (144.5396) loss: 139.9071 (141.3757) masked_loss: 1.4578 (1.4852) tag_loss: 138.2680 (139.8905) time: 1.4324 (1.7940) data: 0.0001 (0.0005) to_device: 0.0050 (0.0049) time_gpu: 1.4274 (1.7886) save_time: 8.8805 (18.1110) lr: 0.000043 max mem: 26307 2022-03-16 22:38:38,862.862 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6969696879386902 2022-03-16 22:38:38,863.863 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 153.47189331054688 2022-03-16 22:38:38,863.863 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.84930174691337 2022-03-16 22:38:58,099.099 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020729536190629005 2022-03-16 22:38:58,099.099 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 22:38:58,100.100 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'baseball', '[MASK]', 'is', 'hitting', 'the', 'ball', 'with', 'his', 'bat', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 22:38:58,115.115 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['man', 'belt', 'glove', '[UNK]', 'shirt', 'baseball', 'head', 'player', 'stripe', 'bat', 'uniform', 'jersey', 'arm', 'helmet', 'logo', 'hat', 'hand', 'fence', 'person', 'wall', 'face', 'sleeve', 'band', 'name', 'cap', 'number', 'tree', 'ear', 'field', 'stadium', 'dirt', 'nose', 'grass', 'ball', 'sky', 'letter', 'base', 'background', 'leg', 'buckle', 'stand', 'crowd', 'sign', 'pole', 'shoe', 'writing', 'hair', 'mouth', 'beard', 'batter'] 2022-03-16 22:39:14,048.048 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'name', 'player', 'person', 'wall', 'arm', 'stand', 'baseball', 'ball', 'sign', 'shirt', 'jersey', 'belt', 'cap', 'uniform', 'bat', 'logo', 'sleeve', 'helmet', 'glove', 'stripe', 'spectator'] 2022-03-16 22:41:37,696.696 2829:trainer.py:487 do_train_dict(): eta: 13:33:52 iter: 37800 speed: 285.7 images/sec total_norm: 145.2930 (146.5777) loss: 143.3892 (144.4144) masked_loss: 1.4745 (1.5224) tag_loss: 142.1373 (142.8920) time: 1.4326 (1.7920) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4275 (1.7867) save_time: 8.8805 (18.1110) lr: 0.000043 max mem: 26307 2022-03-16 22:41:38,059.059 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6111111044883728 2022-03-16 22:41:38,059.059 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 118.5678482055664 2022-03-16 22:41:38,060.060 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
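[annotation] Each logging step prints a 50-item "Sample Generation" list against a "GT Tags" list, a Tag Precision near 70.9 (a percentage), and a Tag mAP near 0.021; a mAP that small is typical of macro-averaging over a large tag vocabulary in which most classes are absent from the batch. The definitions below are a guess at what lines 409/413 of the pipeline compute, since that code is not included here:

    import numpy as np
    from sklearn.metrics import average_precision_score

    def tag_metrics(scores, gt, k=50):
        # scores: (N, V) per-tag scores; gt: (N, V) multi-hot ground truth.
        # Precision@k as a percentage (cf. "Tag Precision. = 70.8...") and a
        # macro mAP over tag classes with positives (cf. "Tag mAP: 0.02...").
        topk = np.argsort(-scores, axis=1)[:, :k]
        hits = np.take_along_axis(gt, topk, axis=1)
        precision_at_k = 100.0 * hits.mean()
        aps = [average_precision_score(gt[:, j], scores[:, j])
               for j in range(gt.shape[1]) if gt[:, j].any()]
        return precision_at_k, float(np.mean(aps))

    rng = np.random.default_rng(0)
    scores = rng.random((8, 200))                   # toy scores for 200 tags
    gt = (rng.random((8, 200)) > 0.9).astype(int)   # sparse multi-hot labels
    print(tag_metrics(scores, gt, k=50))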
= 70.85469861395441 2022-03-16 22:41:57,362.362 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020766209810972214 2022-03-16 22:41:57,362.362 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 22:41:57,363.363 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'salad', 'mixed', '##ச', '[MASK]', ',', 'bro', '##cco', '##li', ',', 'and', 'other', 'items', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 22:41:57,378.378 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tomato', '[UNK]', 'olive', 'pepper', 'bowl', 'salad', 'chicken', 'pasta', 'pan', 'mushroom', 'carrot', 'shrimp', 'food', 'plate', 'pot', 'bean', 'meat', 'table', 'cheese', 'vegetable', 'rice', 'handle', 'dish', 'rim', 'potato', 'corn', 'mixed', 'cherry', 'black', 'spoon', 'full', 'pea', 'stove', 'bread', 'pizza', 'lemon', 'different', 'background', 'top', 'many', 'fruit', 'stir', 'meal', 'fry', 'napkin', 'wall', 'close', 'stem', 'picture', 'red'] 2022-03-16 22:42:13,315.315 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'other', 'mixed', 'bowl', 'plate', 'pan', 'olive', 'corn', 'pepper', 'bean', 'lemon', 'salad', 'shrimp', 'tomato', 'pasta', 'carrot'] 03-16 22:43:02.278 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 22:43:02.278 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 22:43:03.319 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-16 22:44:37,040.040 2829:trainer.py:487 do_train_dict(): eta: 13:31:07 iter: 37900 speed: 285.5 images/sec total_norm: 141.8918 (145.0829) loss: 145.2153 (144.1030) masked_loss: 1.4875 (1.5336) tag_loss: 143.7569 (142.5694) time: 1.4323 (1.7934) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4270 (1.7882) save_time: 8.8805 (18.1110) lr: 0.000043 max mem: 26307 2022-03-16 22:44:37,402.402 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.59375 2022-03-16 22:44:37,402.402 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 156.47735595703125 2022-03-16 22:44:37,402.402 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
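[annotation] The monitor() record above condenses the full nvidia-smi table into one dict per GPU ({'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}): all eight V100s at 100% utilization with ~29,000 MiB of 32,510 MiB in use, consistent with the trainer's "max mem: 26307" plus CUDA context overhead. One way to produce exactly those dicts is nvidia-smi's CSV query interface, sketched below; aml_server.py may instead parse the default table output:

    import subprocess

    def gpu_monitor():
        # One dict per GPU, shaped like the monitor() output above.
        out = subprocess.check_output(
            ["nvidia-smi",
             "--query-gpu=memory.used,memory.total,utilization.gpu",
             "--format=csv,noheader,nounits"],
            text=True,
        )
        stats = []
        for line in out.strip().splitlines():
            used, total, util = (int(v) for v in line.split(", "))
            stats.append({"mem_used": used, "mem_total": total, "gpu_util": util})
        return stats

    if __name__ == "__main__":
        print(gpu_monitor())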
= 70.85262803529439 2022-03-16 22:44:56,648.648 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020804809406399727 2022-03-16 22:44:56,649.649 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 22:44:56,649.649 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'walking', 'in', 'between', 'some', 'trees', 'in', 'a', 'field', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 22:44:56,665.665 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['grass', 'building', 'window', 'rock', 'log', 'balcony', 'ground', 'tree', 'path', 'animal', 'railing', 'bush', 'field', '[UNK]', 'trunk', 'head', 'pole', 'fence', 'branch', 'wood', 'ear', 'nose', 'cow', 'roof', 'post', 'dirt', 'road', 'deer', 'front', 'leg', 'house', 'door', 'horn', 'bear', 'second', 'tall', 'neck', 'next', 'large', 'elephant', 'tail', 'boulder', 'background', 'grassy', 'wall', 'bird', 'curtain', 'zebra', 'pathway', 'stone'] 2022-03-16 22:45:12,563.563 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'building', 'field', 'ground', 'rock', 'window', 'tree', 'wood', 'animal', 'path', 'leg', 'nose', 'palm', 'grass', 'trunk', 'log', 'balcony', 'zebra'] 2022-03-16 22:47:36,287.287 2829:trainer.py:487 do_train_dict(): eta: 13:28:23 iter: 38000 speed: 285.6 images/sec total_norm: 143.5322 (146.8267) loss: 139.6204 (143.0550) masked_loss: 1.4604 (1.5004) tag_loss: 137.9492 (141.5546) time: 1.4321 (1.7925) data: 0.0001 (0.0001) to_device: 0.0052 (0.0051) time_gpu: 1.4268 (1.7872) save_time: 8.8805 (18.1110) lr: 0.000043 max mem: 26307 2022-03-16 22:47:36,647.647 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7058823704719543 2022-03-16 22:47:36,648.648 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 132.0045166015625 2022-03-16 22:47:36,648.648 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.87058102865545 2022-03-16 22:47:56,062.062 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02083042450249195 2022-03-16 22:47:56,062.062 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 22:47:56,063.063 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'horse', 'graz', '##es', 'for', 'grass', '[MASK]', 'a', 'plain', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 22:47:56,078.078 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['head', 'sky', 'leg', 'bush', 'tree', 'horse', 'field', 'grass', 'tail', 'ear', 'neck', 'mane', 'cloud', 'ground', 'shadow', 'face', '[UNK]', 'nose', 'water', 'patch', 'standing', 'body', 'distance', 'eye', 'mouth', 'open', 'grazing', 'animal', 'grassy', 'hair', 'brown', 'dirt', 'large', 'building', 'white', 'area', 'next', 'house', 'pole', 'front', 'puddle', 'black', 'day', 'wild', 'middle', 'top', 'background', 'spot', 'plant', 'hill'] 2022-03-16 22:48:12,054.054 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'face', 'field', 'ground', 'neck', 'tree', 'horse', 'sky', 'leg', 'ear', 'shadow', 'grass', 'tail', 'bush', 'plain', 'cloud', 'mane'] 2022-03-16 22:50:35,651.651 2829:trainer.py:487 do_train_dict(): eta: 13:25:39 iter: 38100 speed: 285.5 images/sec total_norm: 144.8026 (147.0265) loss: 141.8485 (143.8578) masked_loss: 1.4973 (1.5103) tag_loss: 140.5169 (142.3475) time: 1.4327 (1.7937) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4275 (1.7885) save_time: 8.8805 (18.1110) lr: 0.000043 max mem: 26307 2022-03-16 22:50:36,011.011 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4571428596973419 2022-03-16 22:50:36,012.012 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 142.73191833496094 2022-03-16 22:50:36,012.012 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.87913879674143 2022-03-16 22:50:55,511.511 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020914802327752113 2022-03-16 22:50:55,512.512 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 22:50:55,512.512 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'clock', 'in', 'middle', 'of', 'a', 'sculpture', '[MASK]', 'top', '[MASK]', 'building', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 22:50:55,527.527 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'hand', 'building', 'statue', 'clock', 'head', 'sculpture', 'face', 'wall', 'man', 'crown', 'lion', 'fence', 'hair', 'number', '[UNK]', 'leg', 'gold', 'top', 'ledge', 'horse', 'window', 'wing', 'sun', 'sword', 'large', 'railing', 'design', 'ear', 'column', 'blue', 'pillar', 'fountain', 'pole', 'roman', 'shield', 'frame', 'balcony', 'horn', 'eagle', 'person', 'decoration', 'tail', 'metal', 'side', 'background', 'shadow', 'base', 'flower', 'ornate'] 2022-03-16 22:51:11,456.456 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'building', 'top', 'middle', 'wall', 'sky', 'crown', 'clock', 'brick', 'statue', 'sculpture', 'lion', 'fence'] 2022-03-16 22:53:35,016.016 2829:trainer.py:487 do_train_dict(): eta: 13:22:55 iter: 38200 speed: 285.5 images/sec total_norm: 144.0508 (145.6414) loss: 145.1267 (146.4171) masked_loss: 1.5148 (1.5391) tag_loss: 143.5140 (144.8780) time: 1.4322 (1.7936) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4271 (1.7885) save_time: 8.8805 (18.1110) lr: 0.000042 max mem: 26307 2022-03-16 22:53:35,375.375 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6176470518112183 2022-03-16 22:53:35,375.375 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 143.30453491210938 2022-03-16 22:53:35,376.376 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
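[annotation] Across these records the learning rate steps down slowly: 0.000044 at iter 37000, 0.000043 by iter 37600, 0.000042 by iter 38200. That pace is consistent with, though not proof of, linear decay from a base LR near 1e-4 toward zero at roughly iteration 65k; the sketch below assumes exactly that, since the job's real scheduler is not in the log:

    def linear_decay_lr(base_lr, cur_iter, max_iter, warmup_iters=0):
        # Optional linear warmup, then linear decay to zero.
        if cur_iter < warmup_iters:
            return base_lr * cur_iter / max(warmup_iters, 1)
        progress = (cur_iter - warmup_iters) / max(max_iter - warmup_iters, 1)
        return base_lr * max(0.0, 1.0 - progress)

    # Assumed base_lr = 1e-4 and max_iter = 65000:
    print(round(linear_decay_lr(1e-4, 37000, 65000), 6))  # 4.3e-05, near the logged 0.000044
    print(round(linear_decay_lr(1e-4, 38200, 65000), 6))  # 4.1e-05, near the logged 0.000042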
= 70.88548150324013 2022-03-16 22:53:54,669.669 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0208915863186121 2022-03-16 22:53:54,669.669 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 22:53:54,669.669 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'bunch', 'of', 'guys', 'playing', '[MASK]', 'pathway', 'a', 'field', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 22:53:54,685.685 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'tree', 'glove', '[UNK]', 'number', 'wall', 'shadow', 'helmet', 'player', 'shoe', 'line', 'man', 'jersey', 'sky', 'field', 'sign', 'uniform', 'pole', 'hat', 'baseball', 'fence', 'head', 'person', 'bat', 'hand', 'grass', 'cap', 'building', 'net', 'window', 'dirt', 'jacket', 'leg', 'base', 'boy', 'umpire', 'cloud', 'mask', 'game', 'back', 'banner', 'ball', 'catcher', 'goal', 'background', 'girl', 'young', 'plate', 'ready', 'team'] 2022-03-16 22:54:10,538.538 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'number', 'line', 'player', 'field', 'person', 'child', 'wall', 'window', 'tree', 'baseball', 'sign', 'sky', 'shirt', 'jersey', 'leg', 'shadow', 'net', 'cap', 'uniform', 'pole', 'jacket', 'dirt', 'bat', 'fence', 'collar', 'bunch', 'helmet', 'shoe', 'glove', 'stripe'] 2022-03-16 22:56:34,656.656 2829:trainer.py:487 do_train_dict(): eta: 13:20:10 iter: 38300 speed: 285.0 images/sec total_norm: 145.9730 (147.5817) loss: 145.0234 (143.9718) masked_loss: 1.5027 (1.5331) tag_loss: 143.7574 (142.4387) time: 1.4336 (1.7965) data: 0.0001 (0.0001) to_device: 0.0051 (0.0049) time_gpu: 1.4286 (1.7914) save_time: 8.8805 (18.1110) lr: 0.000042 max mem: 26307 2022-03-16 22:56:35,017.017 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.48571428656578064 2022-03-16 22:56:35,017.017 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 134.02793884277344 2022-03-16 22:56:35,018.018 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.88664351900418 2022-03-16 22:56:54,676.676 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02089565619826317 2022-03-16 22:56:54,676.676 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 22:56:54,677.677 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'laying', 'in', 'bed', 'next', 'to', '[MASK]', 'dog', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 22:56:54,692.692 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['head', 'hand', 'face', 'dog', 'arm', 'hair', 'shirt', 'person', 'man', 'eye', 'bear', 'nose', 'ear', 'animal', 'picture', 'woman', '[UNK]', 'chair', 'cat', 'boy', 'paw', 'wall', 'mouth', 'couch', 'stuffed', 'blanket', 'teddy', 'pillow', 'finger', 'floor', 'collar', 'leg', 'mirror', 'girl', 'hat', 'glasses', 'window', 'table', 'child', 'bow', 'foot', 'photo', 'tail', 'curtain', 'book', 'neck', 'tie', 'phone', 'watch', 'baby'] 2022-03-16 22:57:10,640.640 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'hand', 'face', 'line', 'hair', 'mouth', 'person', 'floor', 'bed', 'arm', 'eye', 'shirt', 'dog', 'spot', 'finger', 'nose', 'ear', 'cheek', 'shadow', 'blanket', 'collar', 'eyebrow', 'beard', 'paw'] 2022-03-16 22:59:34,439.439 2829:trainer.py:487 do_train_dict(): eta: 13:17:26 iter: 38400 speed: 284.8 images/sec total_norm: 146.1122 (147.8157) loss: 143.1011 (145.3901) masked_loss: 1.5180 (1.5304) tag_loss: 141.6687 (143.8597) time: 1.4331 (1.7978) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4278 (1.7926) save_time: 8.8805 (18.1110) lr: 0.000042 max mem: 26307 2022-03-16 22:59:34,800.800 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5714285969734192 2022-03-16 22:59:34,800.800 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 152.5529022216797 2022-03-16 22:59:34,800.800 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.88763969718636 2022-03-16 22:59:54,392.392 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020907152444124222 2022-03-16 22:59:54,392.392 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 22:59:54,392.392 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'harbor', 'full', 'of', 'white', 'boats', '[MASK]', 'a', 'plane', 'in', 'the', 'wheeling', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 22:59:54,408.408 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'cloud', 'boat', 'tree', 'airplane', 'water', 'pole', 'building', '[UNK]', 'canopy', 'dock', 'harbor', 'tail', 'wing', 'tent', 'ground', 'lot', 'car', 'flag', 'post', 'white', 'stripe', 'roof', 'marina', 'large', 'airport', 'plane', 'day', 'cloudy', 'cover', 'background', 'lake', 'person', 'ship', 'blue', 'shore', 'engine', 'window', 'mast', 'pier', 'body', 'number', 'sign', 'reflection', 'parking', 'vehicle', 'grass', 'bridge', 'beach', 'distance'] 2022-03-16 23:00:10,370.370 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['water', 'white', 'full', 'cover', 'window', 'tree', 'sky', 'boat', 'plane', 'cloud', 'harbor', 'pole', 'airplane', 'canopy', 'stripe'] 2022-03-16 23:02:33,882.882 2829:trainer.py:487 do_train_dict(): eta: 13:14:42 iter: 38500 speed: 285.3 images/sec total_norm: 144.9681 (147.2144) loss: 143.9183 (144.7134) masked_loss: 1.4767 (1.5306) tag_loss: 142.4166 (143.1828) time: 1.4326 (1.7944) data: 0.0001 (0.0002) to_device: 0.0049 (0.0048) time_gpu: 1.4274 (1.7894) save_time: 8.8805 (18.1110) lr: 0.000042 max mem: 26307 2022-03-16 23:02:34,243.243 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5625 2022-03-16 23:02:34,243.243 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 137.63296508789062 2022-03-16 23:02:34,243.243 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.88041144455035 2022-03-16 23:02:53,865.865 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020896276459097862 2022-03-16 23:02:53,865.865 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 23:02:53,865.865 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'oakland', 'and', 'vegetables', 'are', 'lying', 'on', 'a', 'bar', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 23:02:53,881.881 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['window', 'table', '[UNK]', 'leaf', 'container', 'fruit', 'banana', 'vegetable', 'apple', 'bag', 'box', 'wall', 'lamp', 'food', 'logo', 'lid', 'bottle', 'pitcher', 'sign', 'counter', 'top', 'plant', 'stem', 'vent', 'basket', 'label', 'desk', 'cabinet', 'onion', 'jug', 'pole', 'squash', 'handle', 'mirror', 'pepper', 'mango', 'tray', 'base', 'bowl', 'tomato', 'television', 'paper', 'drawer', 'building', 'bunch', 'bar', 'monitor', 'pear', 'other', 'picture'] 2022-03-16 23:03:09,901.901 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'television', 'table', 'food', 'window', 'bar', 'box', 'sign', 'bag', 'camera', 'fruit', 'apple', 'leaf', 'stem', 'pitcher', 'lamp', 'cord', 'container', 'banana', 'vegetable'] 2022-03-16 23:05:33,548.548 2829:trainer.py:487 do_train_dict(): eta: 13:11:57 iter: 38600 speed: 285.0 images/sec total_norm: 144.4205 (146.9250) loss: 139.2003 (141.5523) masked_loss: 1.4535 (1.5045) tag_loss: 137.8857 (140.0479) time: 1.4333 (1.7966) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4280 (1.7914) save_time: 8.8805 (18.1110) lr: 0.000042 max mem: 26307 2022-03-16 23:05:33,909.909 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4848484992980957 2022-03-16 23:05:33,909.909 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 144.18856811523438 2022-03-16 23:05:33,909.909 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.88768414440698 2022-03-16 23:05:53,750.750 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0209027212113142 2022-03-16 23:05:53,750.750 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 23:05:53,751.751 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'woman', '[MASK]', 'her', 'dog', '[MASK]', 'a', '[MASK]', 'down', 'a', 'path', 'in', 'the', 'woods', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 23:05:53,766.766 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hair', 'dog', 'woman', 'jean', 'arm', 'shirt', 'hand', 'tree', 'ground', 'forest', 'path', 'trail', 'tail', 'leg', 'face', 'leash', 'wood', 'bag', 'head', 'plant', 'bush', 'backpack', 'top', '[UNK]', 'shoe', 'dirt', 'necklace', 'tongue', 'nose', 'collar', 'ear', 'shadow', 'leaf', 'harness', 'neck', 'girl', 'paw', 'foot', 'eye', 'lady', 'mouth', 'tank', 'weed', 'strap', 'rock', 'watch', 'grass', 'wooded', 'sky', 'person'] 2022-03-16 23:06:09,774.774 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'hand', 'face', 'woman', 'ground', 'hair', 'arm', 'forest', 'plant', 'walk', 'foot', 'tree', 'wood', 'jean', 'shirt', 'dog', 'path', 'leg', 'tongue', 'trail', 'bag', 'tail', 'bush', 'dirt', 'shoe', 'necklace', 'backpack', 'harness', 'leash'] 2022-03-16 23:08:33,207.207 2829:trainer.py:487 do_train_dict(): eta: 13:09:13 iter: 38700 speed: 285.0 images/sec total_norm: 144.2442 (146.5112) loss: 146.3764 (144.7257) masked_loss: 1.4183 (1.4822) tag_loss: 145.1942 (143.2435) time: 1.4322 (1.7966) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4270 (1.7914) save_time: 8.8805 (18.1110) lr: 0.000042 max mem: 26307 2022-03-16 23:08:33,567.567 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5675675868988037 2022-03-16 23:08:33,567.567 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 141.6092529296875 2022-03-16 23:08:33,567.567 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
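[annotation] The "total_norm" meter hovering around 145 in these records is the kind of value clip_grad_norm_-style utilities return: the global L2 norm over all parameter gradients, logged every iteration whether or not clipping actually triggers. A minimal PyTorch sketch; whether trainer.py clips at some threshold is not visible in this log:

    import torch

    def total_grad_norm(parameters):
        # Global L2 norm across all parameter gradients, as a plain float.
        norms = [p.grad.detach().norm(2) for p in parameters if p.grad is not None]
        if not norms:
            return 0.0
        return torch.norm(torch.stack(norms), 2).item()

    model = torch.nn.Linear(4, 2)
    loss = model(torch.randn(3, 4)).sum()
    loss.backward()
    print(total_grad_norm(model.parameters()))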
= 70.89115385665107 2022-03-16 23:08:54,133.133 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020916670560836792 2022-03-16 23:08:54,133.133 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 23:08:54,133.133 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', 'gate', 'requires', 'a', 'key', 'but', 'it', 'is', 'locked', 'now', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 23:08:54,149.149 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'sign', 'sidewalk', 'car', 'grass', 'bush', 'fence', 'ground', 'street', '[UNK]', 'bench', 'gate', 'trunk', 'park', 'chain', 'tire', 'suv', 'road', 'light', 'window', 'building', 'wall', 'person', 'pole', 'design', 'parking', 'fire', 'flower', 'woman', 'vehicle', 'plant', 'background', 'leaf', 'wheel', 'jacket', 'man', 'post', 'jeep', 'chair', 'letter', 'truck', 'hair', 'railing', 'motorcycle', 'branch', 'lamp', 'sky', 'rack', 'leg', 'word'] 2022-03-16 23:09:10,069.069 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['park', 'car', 'ground', 'wall', 'key', 'tree', 'sign', 'chain', 'gate', 'wheel', 'bush', 'lock', 'leaf', 'trunk', 'fence', 'sidewalk', 'jeep', 'suv', 'leash'] 2022-03-16 23:11:34,046.046 2829:trainer.py:487 do_train_dict(): eta: 13:06:29 iter: 38800 speed: 283.1 images/sec total_norm: 143.0366 (145.6505) loss: 143.0662 (142.6486) masked_loss: 1.4179 (1.4743) tag_loss: 141.6093 (141.1743) time: 1.4321 (1.8084) data: 0.0001 (0.0005) to_device: 0.0051 (0.0051) time_gpu: 1.4267 (1.8028) save_time: 8.8805 (18.1110) lr: 0.000042 max mem: 26307 2022-03-16 23:11:34,408.408 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5454545617103577 2022-03-16 23:11:34,408.408 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 124.4253158569336 2022-03-16 23:11:34,408.408 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.9113605862105 2022-03-16 23:11:54,016.016 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020928820595145226 2022-03-16 23:11:54,017.017 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 23:11:54,017.017 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'an', 'up', '##cl', '##ose', '[MASK]', 'of', '[MASK]', 'zebra', 'accent', '##ing', 'its', 'stripes', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 23:11:54,032.032 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['stripe', 'zebra', 'close', '[UNK]', 'neck', 'white', 'black', 'side', 'eye', 'line', 'striped', 'other', 'large', 'spot', 'next', 'wall', 'shot', 'face', 'surface', 'red', 'blue', 'shadow', 'head', 'ear', 'front', 'round', 'open', 'brown', 'design', 'view', 'many', 'image', 'area', 'different', 'leg', 'colorful', 'light', 'long', 'pair', 'middle', 'big', 'small', 'green', 'plain', 'picture', 'number', 'row', 'strip', 'pattern', 'group'] 2022-03-16 23:12:09,876.876 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'image', 'stripe', 'zebra'] 03-16 23:13:03.373 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 23:13:03.373 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 23:13:04.544 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 88}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-16 23:14:33,850.850 2829:trainer.py:487 do_train_dict(): eta: 13:03:44 iter: 38900 speed: 284.8 images/sec total_norm: 144.9280 (146.3157) loss: 139.7748 (143.3813) masked_loss: 1.5157 (1.5194) tag_loss: 138.2592 (141.8619) time: 1.4329 (1.7980) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4276 (1.7929) save_time: 8.8805 (18.1110) lr: 0.000041 max mem: 26307 2022-03-16 23:14:34,212.212 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.42424243688583374 2022-03-16 23:14:34,212.212 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 137.31866455078125 2022-03-16 23:14:34,212.212 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.91594096452762 2022-03-16 23:14:54,107.107 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02100761979818344 2022-03-16 23:14:54,107.107 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 23:14:54,108.108 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'pitcher', '[MASK]', 'just', 'finished', '[MASK]', 'a', 'ball', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 23:14:54,123.123 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['grass', 'field', '[UNK]', 'fence', 'head', 'dirt', 'glove', 'pole', 'stripe', 'shirt', 'baseball', 'shoe', 'man', 'sign', 'leg', 'bar', 'uniform', 'jersey', 'hand', 'shadow', 'mound', 'logo', 'player', 'number', 'hat', 'ground', 'ball', 'cap', 'arm', 'tree', 'pitcher', 'post', 'belt', 'game', 'pitch', 'sleeve', 'background', 'letter', 'face', 'ear', 'sock', 'wall', 'young', 'plate', 'boy', 'line', 'back', 'nose', 'base', 'ready'] 2022-03-16 23:15:10,014.014 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'player', 'field', 'ground', 'arm', 'bar', 'tree', 'ball', 'letter', 'sign', 'shirt', 'jersey', 'path', 'leg', 'shadow', 'grass', 'hat', 'uniform', 'pole', 'dirt', 'pitcher', 'fence', 'shoe', 'mound', 'necklace', 'glove', 'stripe'] 2022-03-16 23:17:33,687.687 2829:trainer.py:487 do_train_dict(): eta: 13:01:00 iter: 39000 speed: 284.7 images/sec total_norm: 144.6783 (146.4727) loss: 141.0400 (142.9053) masked_loss: 1.4229 (1.5149) tag_loss: 139.7280 (141.3904) time: 1.4330 (1.7983) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4277 (1.7931) save_time: 8.8805 (18.1110) lr: 0.000041 max mem: 26307 2022-03-16 23:17:34,046.046 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6969696879386902 2022-03-16 23:17:34,047.047 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 137.14022827148438 2022-03-16 23:17:34,047.047 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.92032150112455 2022-03-16 23:17:53,794.794 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021010620519518852 2022-03-16 23:17:53,794.794 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 23:17:53,794.794 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', '[MASK]', 'the', 'air', 'doing', 'a', 'trick', 'on', 'his', '[MASK]', '##board', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 23:17:53,810.810 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'arm', 'man', 'tree', 'short', '[UNK]', 'shoe', 'hand', 'head', 'hair', 'ear', 'wheel', 'leg', 'bracelet', 'watch', 'wrist', 'boy', 'air', 'belt', 'band', 'face', 'board', 'young', 'trick', 'foot', 'logo', 'pocket', 'design', 'sky', 'sleeve', 'background', 'skate', 'knee', 'ground', 'hat', 'nose', 'jumping', 'sunglasses', 'small', 'mid', 'glasses', 'shadow', 'person', 'jump', 'fence', 'helmet', 'stripe', 'mouth', 'grass', 'bush'] 2022-03-16 23:18:09,817.817 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'air', 'short', 'hair', 'design', 'arm', 'tree', 'watch', 'sky', 'shirt', 'leg', 'ear', 'wheel', 'wrist', 'trick', 'shoe'] 2022-03-16 23:20:33,582.582 2829:trainer.py:487 do_train_dict(): eta: 12:58:15 iter: 39100 speed: 284.6 images/sec total_norm: 148.3647 (151.0380) loss: 143.3880 (144.0234) masked_loss: 1.4590 (1.4742) tag_loss: 141.8699 (142.5493) time: 1.4337 (1.7990) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4286 (1.7938) save_time: 8.8805 (18.1110) lr: 0.000041 max mem: 26307 2022-03-16 23:20:33,943.943 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6176470518112183 2022-03-16 23:20:33,944.944 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 138.30784606933594 2022-03-16 23:20:33,944.944 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.92602927344186 2022-03-16 23:20:53,834.834 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021004609763622284 2022-03-16 23:20:53,835.835 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 23:20:53,835.835 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'small', 'pup', '##pies', 'eating', 'dog', 'food', 'out', 'of', '[MASK]', 'large', 'bowl', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 23:20:53,850.850 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['head', 'face', 'dog', 'nose', 'paw', 'ear', 'letter', 'eye', 'writing', 'bucket', 'tail', 'stripe', 'food', 'bean', '[UNK]', 'leg', 'word', 'fur', 'floor', 'barrel', 'puppy', 'bowl', 'white', 'mat', 'back', 'small', 'rim', 'animal', 'cereal', 'can', 'trash', 'pot', 'object', 'plate', 'dish', 'mouth', 'container', 'table', 'mushroom', 'water', 'ground', 'wall', 'pony', 'body', 'front', 'black', 'tire', 'tile', 'lettering', 'spot'] 2022-03-16 23:21:09,674.674 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'face', 'small', 'line', 'large', 'floor', 'food', 'writing', 'eye', 'letter', 'dog', 'leg', 'nose', 'ear', 'bowl', 'tail', 'barrel', 'mat', 'bucket', 'stripe', 'paw'] 2022-03-16 23:23:33,800.800 2829:trainer.py:487 do_train_dict(): eta: 12:55:31 iter: 39200 speed: 284.1 images/sec total_norm: 143.5918 (147.0592) loss: 141.8367 (142.4500) masked_loss: 1.5637 (1.5546) tag_loss: 139.8335 (140.8954) time: 1.4338 (1.8022) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4284 (1.7970) save_time: 8.8805 (18.1110) lr: 0.000041 max mem: 26307 2022-03-16 23:23:34,161.161 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.529411792755127 2022-03-16 23:23:34,162.162 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 164.35208129882812 2022-03-16 23:23:34,162.162 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.92469875927796 2022-03-16 23:23:53,953.953 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021037157624959946 2022-03-16 23:23:53,953.953 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 23:23:53,953.953 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'is', '[MASK]', 'to', 'serve', '[MASK]', 'tennis', 'ball', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 23:23:53,969.969 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', '[UNK]', 'man', 'hand', 'head', 'tennis', 'nose', 'arm', 'hair', 'mouth', 'ear', 'face', 'banner', 'court', 'wall', 'logo', 'fence', 'handle', 'sign', 'eye', 'ball', 'short', 'player', 'stripe', 'cap', 'neck', 'person', 'hat', 'woman', 'necklace', 'uniform', 'net', 'ground', 'letter', 'glasses', 'jersey', 'leg', 'band', 'spectator', 'top', 'flag', 'collar', 'sunglasses', 'chair', 'sleeve', 'line', 'watch', 'grass', 'wrist', 'writing'] 2022-03-16 23:24:09,872.872 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'hand', 'face', 'court', 'short', 'hair', 'mouth', 'person', 'wall', 'arm', 'boy', 'base', 'eye', 'chair', 'ball', 'shirt', 'nose', 'ear', 'tennis', 'uniform', 'banner', 'shoe', 'stripe', 'sock'] 2022-03-16 23:26:33,864.864 2829:trainer.py:487 do_train_dict(): eta: 12:52:46 iter: 39300 speed: 284.3 images/sec total_norm: 145.6378 (148.6445) loss: 145.5119 (143.9531) masked_loss: 1.5402 (1.5516) tag_loss: 143.6169 (142.4016) time: 1.4322 (1.8007) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4271 (1.7956) save_time: 8.8805 (18.1110) lr: 0.000041 max mem: 26307 2022-03-16 23:26:34,225.225 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663 2022-03-16 23:26:34,225.225 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 165.55166625976562 2022-03-16 23:26:34,229.229 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.91679331009763
2022-03-16 23:26:54,038.038 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021045707166194916
2022-03-16 23:26:54,038.038 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 23:26:54,038.038 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', 'woman', 'is', 'standing', 'in', 'her', 'kitchen', '[MASK]', 'to', 'her', 'small', 'freeze', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 23:26:54,054.054 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'floor', 'short', 'refrigerator', 'hand', 'shirt', 'woman', 'leg', '[UNK]', 'magnet', 'hair', 'foot', 'cord', 'face', 'head', 'outlet', 'flop', 'nose', 'door', 'shoe', 'eye', 'flip', 'kitchen', 'wire', 'switch', 'lady', 'paper', 'top', 'arm', 'smile', 'fridge', 'tile', 'can', 'stripe', 'ear', 'mouth', 'cabinet', 'handle', 'tank', 'glasses', 'lid', 'logo', 'next', 'box', 'ground', 'cup', 'neck', 'man', 'carpet', 'girl']
2022-03-16 23:27:09,865.865 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'hand', 'face', 'small', 'line', 'next', 'door', 'woman', 'short', 'hair', 'floor', 'wall', 'arm', 'lady', 'foot', 'shirt', 'kitchen', 'leg', 'handle', 'glasses', 'dot', 'flip', 'cord', 'outlet', 'bucket', 'tile', 'jar', 'magnet', 'refrigerator', 'stripe', 'flop']
2022-03-16 23:29:33,874.874 2829:trainer.py:487 do_train_dict(): eta: 12:50:01 iter: 39400 speed: 284.4 images/sec total_norm: 145.0777 (148.0078) loss: 142.2869 (141.4427) masked_loss: 1.5194 (1.5303) tag_loss: 140.6424 (139.9123) time: 1.4325 (1.8001) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4273 (1.7949) save_time: 8.8805 (18.1110) lr: 0.000041 max mem: 26307
2022-03-16 23:29:34,235.235 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5151515007019043
2022-03-16 23:29:34,235.235 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 124.91006469726562
2022-03-16 23:29:34,236.236 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.92718570564367
2022-03-16 23:29:54,438.438 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02103949338197708
2022-03-16 23:29:54,438.438 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 23:29:54,439.439 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'black', 'motorcycle', 'parked', 'on', 'the', 'grass', 'next', 'to', 'some', 'si', '##los', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 23:29:54,454.454 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'tree', 'road', 'motorcycle', 'grass', 'wall', 'tire', 'bike', 'building', 'bush', 'wheel', '[UNK]', 'light', 'mirror', 'tower', 'seat', 'fence', 'structure', 'crane', 'cloud', 'helmet', 'street', 'pole', 'side', 'field', 'car', 'bridge', 'pipe', 'water', 'sun', 'front', 'track', 'sign', 'dirt', 'next', 'black', 'windshield', 'window', 'city', 'shadow', 'bag', 'background', 'ground', 'large', 'couple', 'curb', 'post', 'sunset', 'leaf', 'exhaust']
2022-03-16 23:30:10,526.526 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'black', 'building', 'road', 'tree', 'tower', 'sky', 'shadow', 'wheel', 'grass', 'bush', 'bike', 'motorcycle', 'ladder', 'crane', 'windshield']
2022-03-16 23:32:34,144.144 2829:trainer.py:487 do_train_dict(): eta: 12:47:16 iter: 39500 speed: 284.0 images/sec total_norm: 145.1894 (146.7714) loss: 145.5321 (145.5164) masked_loss: 1.4324 (1.5180) tag_loss: 144.6010 (143.9984) time: 1.4325 (1.8027) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4273 (1.7975) save_time: 8.8805 (18.1110) lr: 0.000041 max mem: 26307
2022-03-16 23:32:34,505.505 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6486486196517944
2022-03-16 23:32:34,505.505 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 152.7017364501953
2022-03-16 23:32:34,505.505 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.92869320301095
2022-03-16 23:32:54,638.638 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021076921373605728
2022-03-16 23:32:54,638.638 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 23:32:54,638.638 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'train', 'engine', 'with', 'train', '[MASK]', 'behind', 'it', '[MASK]', 'riding', 'on', 'a', 'set', 'of', '[MASK]', 'with', 'smoke', 'blowing', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 23:32:54,654.654 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['train', 'smoke', 'car', 'grass', 'window', 'track', 'steam', 'engine', 'wheel', 'gravel', 'tree', 'stream', 'water', 'rock', 'number', 'writing', 'wall', 'bush', 'roof', 'man', '[UNK]', 'red', 'door', 'hill', 'stripe', 'pole', 'top', 'sign', 'fence', 'black', 'person', 'building', 'tank', 'light', 'trunk', 'line', 'conductor', 'logo', 'road', 'toy', 'flower', 'bumper', 'model', 'ladder', 'hillside', 'plant', 'shirt', 'container', 'blue', 'passing']
2022-03-16 23:33:10,576.576 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['number', 'set', 'light', 'car', 'track', 'person', 'wall', 'engine', 'window', 'train', 'roof', 'wheel', 'stream', 'steam', 'grass', 'smoke', 'bush', 'logo', 'fence']
2022-03-16 23:35:34,253.253 2829:trainer.py:487 do_train_dict(): eta: 12:44:31 iter: 39600 speed: 284.3 images/sec total_norm: 144.0294 (146.7562) loss: 142.3820 (143.5742) masked_loss: 1.5020 (1.5617) tag_loss: 140.6226 (142.0125) time: 1.4325 (1.8011) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4274 (1.7960) save_time: 8.8805 (18.1110) lr: 0.000040 max mem: 26307
2022-03-16 23:35:34,614.614 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5142857432365417
2022-03-16 23:35:34,614.614 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 135.71539306640625
2022-03-16 23:35:34,615.615 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
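[Editor's note] The "speed" field is consistent with batch size divided by average iteration time: the run directory name later in this log records batch-size 512, and at iter 39600 the averaged time is 1.8011 s. A quick check (assuming speed is computed exactly this way, which the log itself does not state):

```python
batch_size = 512        # from the output directory name in this log
avg_iter_time = 1.8011  # "time: 1.4325 (1.8011)" at iter 39600
print(f"{batch_size / avg_iter_time:.1f} images/sec")  # 284.3, matching the log
```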
= 70.93804578997326
2022-03-16 23:35:54,809.809 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02107016183435917
2022-03-16 23:35:54,809.809 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 23:35:54,809.809 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'banana', 'and', 'two', '[MASK]', 'fashioned', 'to', 'resemble', 'a', 'smiling', 'face', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 23:35:54,824.824 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['apple', 'stem', 'table', 'banana', 'shadow', 'fruit', 'spot', 'green', 'top', 'wooden', 'end', 'light', '[UNK]', 'face', 'reflection', 'ripe', 'white', 'orange', 'bunch', 'next', 'smiley', 'red', 'tomato', 'surface', 'close', 'line', 'small', 'counter', 'bowl', 'board', 'black', 'design', 'brown', 'different', 'eye', 'handle', 'half', 'knot', 'full', 'other', 'paper', 'cut', 'sit', 'wood', 'plate', 'many', 'picture', 'group', 'large', 'single']
2022-03-16 23:36:10,774.774 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['face', 'top', 'table', 'shadow', 'smiling', 'fruit', 'apple', 'stem', 'banana']
2022-03-16 23:38:34,642.642 2829:trainer.py:487 do_train_dict(): eta: 12:41:47 iter: 39700 speed: 283.8 images/sec total_norm: 144.1105 (146.7088) loss: 141.6339 (144.2381) masked_loss: 1.5692 (1.5496) tag_loss: 139.7788 (142.6884) time: 1.4346 (1.8039) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4294 (1.7988) save_time: 8.8805 (18.1110) lr: 0.000040 max mem: 26307
2022-03-16 23:38:35,002.002 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5142857432365417
2022-03-16 23:38:35,002.002 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 159.22828674316406
2022-03-16 23:38:35,002.002 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.94082688566428
2022-03-16 23:38:55,178.178 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021076759323477745
2022-03-16 23:38:55,178.178 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 23:38:55,179.179 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', '[MASK]', 'on', 'top', 'of', 'a', 'beach', 'under', 'a', '[MASK]', 'sky', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 23:38:55,194.194 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'kite', 'beach', 'man', 'sand', 'ocean', 'water', 'wave', 'cloud', 'person', '[UNK]', 'head', 'shirt', 'jacket', 'horizon', 'leg', 'short', 'shore', 'footprint', 'couple', 'jean', 'board', 'coat', 'bag', 'string', 'arm', 'foot', 'sandy', 'dog', 'tail', 'hair', 'hat', 'cloudy', 'track', 'hill', 'day', 'surf', 'surfer', 'shoe', 'rock', 'boy', 'mountain', 'backpack', 'wind', 'para', 'ground', 'hand', 'group', 'child', 'stick']
2022-03-16 23:39:11,149.149 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'water', 'top', 'short', 'person', 'beach', 'sky', 'shirt', 'ocean', 'leg', 'wave', 'sand', 'cloud', 'jacket', 'horizon', 'glove', 'kite', 'cloudy']
2022-03-16 23:41:34,749.749 2829:trainer.py:487 do_train_dict(): eta: 12:39:02 iter: 39800 speed: 284.3 images/sec total_norm: 144.4473 (147.2415) loss: 139.6186 (141.8172) masked_loss: 1.4489 (1.4903) tag_loss: 138.0980 (140.3269) time: 1.4327 (1.8011) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4275 (1.7959) save_time: 8.8805 (18.1110) lr: 0.000040 max mem: 26307
2022-03-16 23:41:35,114.114 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6176470518112183
2022-03-16 23:41:35,114.114 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 118.9981689453125
2022-03-16 23:41:35,115.115 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.94300183556732
2022-03-16 23:41:55,185.185 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02114068530499935
2022-03-16 23:41:55,185.185 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 23:41:55,186.186 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'woman', 'with', 'a', 'plate', 'of', 'food', 'that', 'includes', 'soup', 'and', 'chewing', 'sandwich', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 23:41:55,201.201 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['bowl', 'hand', 'shirt', '[UNK]', 'sandwich', 'soup', 'plate', 'bread', 'table', 'spoon', 'man', 'person', 'cup', 'food', 'tomato', 'nose', 'salad', 'restaurant', 'face', 'hair', 'eye', 'jacket', 'head', 'chair', 'glasses', 'wall', 'glass', 'fork', 'woman', 'straw', 'napkin', 'ear', 'handle', 'finger', 'container', 'watch', 'arm', 'basket', 'mouth', 'picture', 'sunglasses', 'logo', 'phone', 'sauce', 'background', 'pot', 'sweater', 'large', 'design', 'ring']
2022-03-16 23:42:11,040.040 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'hand', 'woman', 'cup', 'hair', 'mouth', 'person', 'table', 'wall', 'food', 'eye', 'chair', 'plant', 'shirt', 'picture', 'nose', 'bowl', 'restaurant', 'plate', 'jacket', 'bread', 'soup', 'sandwich', 'candle', 'lemon', 'spoon', 'tomato']
03-16 23:43:04.589 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi
03-16 23:43:04.589 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi
03-16 23:43:05.750 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 88}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 88}]
2022-03-16 23:44:35,197.197 2829:trainer.py:487 do_train_dict(): eta: 12:36:17 iter: 39900 speed: 283.7 images/sec total_norm: 144.5613 (147.8947) loss: 144.5917 (144.3493) masked_loss: 1.4687 (1.5257) tag_loss: 143.1203 (142.8236) time: 1.4343 (1.8045) data: 0.0001 (0.0005) to_device: 0.0051 (0.0050) time_gpu: 1.4292 (1.7990) save_time: 8.8805 (18.1110) lr: 0.000040 max mem: 26307
2022-03-16 23:44:35,557.557 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361
2022-03-16 23:44:35,558.558 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 135.4141387939453
2022-03-16 23:44:35,558.558 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
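[Editor's note] The interleaved aml_server.py entries show a monitor thread shelling out to nvidia-smi and reporting one {'mem_used', 'mem_total', 'gpu_util'} dict per GPU. aml_server.py's actual parsing code is not visible in this log; the sketch below uses nvidia-smi's query mode (a real CLI interface) to produce output of the same shape:

```python
import subprocess

def gpu_monitor():
    """Return one dict per GPU in the shape logged by monitor().
    Sketch only: not aml_server.py's actual implementation."""
    out = subprocess.check_output(
        ["nvidia-smi",
         "--query-gpu=memory.used,memory.total,utilization.gpu",
         "--format=csv,noheader,nounits"],
        text=True,
    )
    gpus = []
    for line in out.strip().splitlines():
        # each line looks like "29024, 32510, 100"
        mem_used, mem_total, util = (int(x) for x in line.split(", "))
        gpus.append({"mem_used": mem_used,
                     "mem_total": mem_total,
                     "gpu_util": util})
    return gpus
```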
= 70.94843773841858
2022-03-16 23:44:55,715.715 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0211332980543375
2022-03-16 23:44:55,716.716 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 23:44:55,716.716 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'holding', 'a', '[MASK]', 'board', 'with', 'people', 'walking', 'behind', 'him', 'near', 'a', 'building', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 23:44:55,731.731 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hood', 'man', 'jacket', 'glove', 'building', '[UNK]', 'face', 'person', 'nose', 'hand', 'head', 'letter', 'mouth', 'sky', 'eye', 'hat', 'roof', 'shoe', 'jean', 'sign', 'sidewalk', 'ground', 'coat', 'wall', 'boot', 'board', 'stripe', 'backpack', 'window', 'mustache', 'word', 'bag', 'helmet', 'floor', 'finger', 'street', 'woman', 'tree', 'door', 'boy', 'billboard', 'leg', 'brick', 'patch', 'pole', 'bicycle', 'photo', 'pocket', 'picture', 'front']
2022-03-16 23:45:11,656.656 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'line', 'building', 'ground', 'board', 'person', 'eye', 'letter', 'sign', 'sky', 'roof', 'nose', 'bag', 'coat', 'hat', 'jacket', 'hood', 'ski', 'helmet', 'shoe', 'sidewalk', 'tire', 'glove', 'hose', 'mustache']
2022-03-16 23:47:35,975.975 2829:trainer.py:487 do_train_dict(): eta: 12:33:32 iter: 40000 speed: 283.2 images/sec total_norm: 146.1526 (149.3473) loss: 140.3300 (143.2892) masked_loss: 1.5450 (1.5433) tag_loss: 139.1518 (141.7459) time: 1.4338 (1.8077) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4285 (1.8026) save_time: 8.8805 (18.1110) lr: 0.000040 max mem: 26307
2022-03-16 23:47:35,977.977 2829:checkpoint.py:222 save(): Saving checkpoint to output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0040000.pt
2022-03-16 23:47:45,490.490 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361
2022-03-16 23:47:45,490.490 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 164.34422302246094
2022-03-16 23:47:45,491.491 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
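[Editor's note] The checkpoint.py entry at iter 40000 writes snapshot/model_iter_0040000.pt under the run's output directory, i.e. the iteration number zero-padded to seven digits. A hedged one-liner reproducing that naming (the function name and the placeholder directory are hypothetical; checkpoint.py's own save code is not shown):

```python
import os

def snapshot_path(output_dir, iteration):
    # "model_iter_0040000.pt": seven-digit zero padding, as in the log
    return os.path.join(output_dir, "snapshot", f"model_iter_{iteration:07d}.pt")

print(snapshot_path("output/my_run", 40000))
# output/my_run/snapshot/model_iter_0040000.pt
```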
= 70.95160658876794
2022-03-16 23:48:05,848.848 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021165965124964714
2022-03-16 23:48:05,849.849 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 23:48:05,849.849 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'sign', 'that', 'has', 'some', 'ice', 'hanging', '[MASK]', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 23:48:05,864.864 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['building', 'window', 'sign', 'sky', 'snow', 'wall', 'roof', 'pole', 'ice', '[UNK]', 'post', 'fence', 'brick', 'door', 'street', 'snowy', 'tree', 'side', 'letter', 'top', 'stop', 'design', 'line', 'cloud', 'corner', 'chimney', 'tall', 'person', 'image', 'ledge', 'antenna', 'graffiti', 'front', 'large', 'city', 'paint', 'frame', 'next', 'white', 'board', 'blue', 'wood', 'wire', 'covered', 'light', 'bunch', 'structure', 'arrow', 'water', 'couple']
2022-03-16 23:48:21,665.665 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['building', 'wall', 'ice', 'window', 'sign', 'sky', 'roof', 'snow']
2022-03-16 23:50:44,614.614 2829:trainer.py:487 do_train_dict(): eta: 12:30:52 iter: 40100 speed: 271.4 images/sec total_norm: 143.2763 (146.1242) loss: 145.2726 (144.4033) masked_loss: 1.4612 (1.4940) tag_loss: 143.9406 (142.9093) time: 1.4333 (1.8864) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4280 (1.7898) save_time: 8.8805 (16.9902) lr: 0.000040 max mem: 26307
2022-03-16 23:50:44,976.976 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663
2022-03-16 23:50:44,976.976 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 105.00901794433594
2022-03-16 23:50:44,976.976 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.96004895072672
2022-03-16 23:51:05,456.456 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021176688373088837
2022-03-16 23:51:05,456.456 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 23:51:05,457.457 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'different', '[MASK]', 'of', 'animals', 'grazing', '[MASK]', 'a', 'field', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 23:51:05,472.472 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hill', 'zebra', 'field', 'sky', 'head', 'bush', 'grass', 'animal', 'tree', 'tail', 'herd', 'cow', '[UNK]', 'horn', 'mane', 'leg', 'buffalo', 'group', 'stripe', 'hillside', 'shadow', 'plain', 'ear', 'grassy', 'cloud', 'horse', 'wild', 'open', 'bird', 'horizon', 'bunch', 'other', 'many', 'green', 'nose', 'dry', 'large', 'goat', 'number', 'tall', 'top', 'day', 'grazing', 'elephant', 'couple', 'savannah', 'sunny', 'next', 'face', 'few']
2022-03-16 23:51:21,350.350 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'different', 'field', 'hill', 'tree', 'sky', 'animal', 'grass', 'tail', 'bush', 'plain', 'herd', 'mane', 'zebra']
2022-03-16 23:53:45,064.064 2829:trainer.py:487 do_train_dict(): eta: 12:28:07 iter: 40200 speed: 283.7 images/sec total_norm: 143.9652 (148.0750) loss: 140.2942 (140.5746) masked_loss: 1.5093 (1.5210) tag_loss: 138.3478 (139.0536) time: 1.4329 (1.8046) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4279 (1.7994) save_time: 8.8805 (16.9902) lr: 0.000039 max mem: 26307
2022-03-16 23:53:45,427.427 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6666666865348816
2022-03-16 23:53:45,428.428 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 143.53134155273438
2022-03-16 23:53:45,428.428 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
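[Editor's note] The "Tag mAP" figure is a mean average precision over the tag vocabulary, scoring predicted tags against the GT Tags sets printed alongside each sample. The exact computation in tagger_caption_uni_pipeline_expanding.py is not shown in this log; a common way to compute a multi-label mAP of this kind (sketch only, using scikit-learn) is:

```python
import numpy as np
from sklearn.metrics import average_precision_score

def tag_map(y_true, y_score):
    """Mean AP over tag classes with at least one positive example.
    y_true:  (num_samples, num_tags) binary ground-truth matrix.
    y_score: (num_samples, num_tags) predicted tag scores.
    Sketch only; not the pipeline's own mAP code."""
    aps = [average_precision_score(y_true[:, c], y_score[:, c])
           for c in range(y_true.shape[1]) if y_true[:, c].any()]
    return float(np.mean(aps))
```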
= 70.97026062958294
2022-03-16 23:54:05,807.807 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021170616149902344
2022-03-16 23:54:05,808.808 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 23:54:05,808.808 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'riding', 'a', 'skate', '##board', '[MASK]', 'a', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 23:54:05,823.823 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['road', 'line', 'street', 'building', '[UNK]', 'sidewalk', 'ground', 'pole', 'curb', 'sign', 'sky', 'tree', 'wall', 'window', 'shoe', 'step', 'man', 'shirt', 'shadow', 'light', 'car', 'head', 'door', 'bush', 'wheel', 'leg', 'stair', 'person', 'hand', 'fence', 'jean', 'tire', 'house', 'post', 'fire', 'hair', 'roof', 'railing', 'grass', 'letter', 'truck', 'face', 'traffic', 'hat', 'boy', 'short', 'city', 'arm', 'wire', 'stop']
2022-03-16 23:54:21,791.791 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'line', 'building', 'door', 'road', 'ground', 'wall', 'arm', 'boy', 'window', 'step', 'letter', 'sign', 'jean', 'shirt', 'leg', 'wheel', 'hat', 'cap', 'reflection', 'shoe', 'sidewalk', 'curb']
2022-03-16 23:56:45,551.551 2829:trainer.py:487 do_train_dict(): eta: 12:25:22 iter: 40300 speed: 283.7 images/sec total_norm: 144.4516 (146.2000) loss: 139.0191 (140.5680) masked_loss: 1.4820 (1.5087) tag_loss: 137.3365 (139.0593) time: 1.4330 (1.8048) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4281 (1.7998) save_time: 8.8805 (16.9902) lr: 0.000039 max mem: 26307
2022-03-16 23:56:45,912.912 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5833333134651184
2022-03-16 23:56:45,913.913 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 161.34767150878906
2022-03-16 23:56:45,913.913 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.97349332110717
2022-03-16 23:57:06,432.432 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02123328112065792
2022-03-16 23:57:06,432.432 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 23:57:06,433.433 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'bear', 'rolling', 'on', 'his', 'back', 'on', 'some', 'logs', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 23:57:06,449.449 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['bear', 'rock', 'log', 'head', 'ear', 'grass', 'nose', 'bush', 'branch', 'ground', 'tree', 'leaf', 'plant', 'animal', 'trunk', 'leg', 'paw', 'snout', 'eye', 'fur', 'face', 'mouth', 'black', 'wood', 'tail', 'dirt', 'cat', 'wall', 'large', 'brown', 'back', 'cub', 'field', 'stick', 'forest', 'zoo', 'enclosure', 'claw', 'weed', 'neck', '[UNK]', 'knot', 'fence', 'stone', 'tongue', 'pole', 'flower', 'water', 'dog', 'hole']
2022-03-16 23:57:22,334.334 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['back', 'head', 'ground', 'rock', 'wall', 'plant', 'tree', 'branch', 'animal', 'leg', 'nose', 'ear', 'bear', 'cat', 'rolling', 'grass', 'bush', 'fur', 'leaf', 'trunk', 'log', 'zoo', 'claw', 'paw']
2022-03-16 23:59:46,400.400 2829:trainer.py:487 do_train_dict(): eta: 12:22:37 iter: 40400 speed: 283.1 images/sec total_norm: 146.2662 (148.4430) loss: 146.1528 (145.9936) masked_loss: 1.4482 (1.4997) tag_loss: 144.6334 (144.4938) time: 1.4334 (1.8085) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4282 (1.8033) save_time: 8.8805 (16.9902) lr: 0.000039 max mem: 26307
2022-03-16 23:59:46,762.762 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5714285969734192
2022-03-16 23:59:46,762.762 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 151.17861938476562
2022-03-16 23:59:46,762.762 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.97270175321603
2022-03-17 00:00:07,373.373 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021215716376900673
2022-03-17 00:00:07,373.373 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 00:00:07,373.373 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'little', 'girl', 'in', 'a', 'pink', 'and', '[MASK]', 'dress', '[MASK]', 'her', 'arm', '[MASK]', 'and', 'a', 'kite', 'flies', 'in', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 00:00:07,389.389 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'kite', 'water', 'dress', 'arm', 'grass', 'flower', 'girl', 'hair', 'boat', 'hand', 'child', 'building', 'tail', 'string', 'dirt', 'ground', 'tree', 'hill', 'city', '[UNK]', 'house', 'sand', 'beach', 'shadow', 'handle', 'sidewalk', 'head', 'little', 'young', 'leg', 'path', 'person', 'woman', 'bracelet', 'ribbon', 'bow', 'skirt', 'watch', 'shore', 'baby', 'elbow', 'bush', 'lake', 'body', 'ball', 'weed', 'plant', 'wrist', 'field']
2022-03-17 00:00:23,395.395 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'city', 'house', 'hand', 'little', 'water', 'building', 'hair', 'girl', 'blue', 'green', 'child', 'arm', 'tree', 'beach', 'sky', 'boat', 'dress', 'handle', 'pink', 'string', 'sand', 'grass', 'tail', 'bush', 'flower', 'dirt', 'skirt', 'sidewalk', 'kite']
2022-03-17 00:02:47,006.006 2829:trainer.py:487 do_train_dict(): eta: 12:19:51 iter: 40500 speed: 283.5 images/sec total_norm: 147.2135 (150.0210) loss: 144.4520 (145.5894) masked_loss: 1.4516 (1.4931) tag_loss: 143.0449 (144.0964) time: 1.4330 (1.8061) data: 0.0001 (0.0001) to_device: 0.0051 (0.0049) time_gpu: 1.4277 (1.8010) save_time: 8.8805 (16.9902) lr: 0.000039 max mem: 26307
2022-03-17 00:02:47,366.366 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361
2022-03-17 00:02:47,367.367 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 148.04547119140625
2022-03-17 00:02:47,367.367 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
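[Editor's note] Across this stretch of the log the printed learning rate steps down slowly: 0.000041 near iter 39.3k, 0.000040 around iter 39.6k-40.9k, 0.000039 near 40.2k, down to 0.000037 by iter 41.6k. That trajectory is consistent with a linear decay from the lr_1e-4 base rate named in the run directory. The schedule below is a hypothetical fit, not something the log states; max_iter is inferred from the printed values and could easily be off:

```python
def linear_lr(iteration, base_lr=1e-4, max_iter=66000):
    # hypothetical linear decay to zero; 1e-4 * (1 - 39600/66000) = 4.0e-5
    # matches the "lr: 0.000040" printed at iter 39600
    return base_lr * max(0.0, 1.0 - iteration / max_iter)

print(f"{linear_lr(39600):.6f}")  # 0.000040
```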
= 70.96762747365266
2022-03-17 00:03:07,886.886 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021214559674263
2022-03-17 00:03:07,886.886 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 00:03:07,886.886 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'table', 'with', 'some', 'bananas', '[MASK]', 'pick', '[MASK]', 'on', 'it', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 00:03:07,902.902 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['banana', 'table', '[UNK]', 'letter', 'jar', 'bag', 'bottle', 'sign', 'label', 'cloth', 'shirt', 'man', 'person', 'can', 'bunch', 'lid', 'board', 'wall', 'store', 'hand', 'shoe', 'display', 'wheel', 'bananas', 'woman', 'logo', 'cap', 'word', 'towel', 'rack', 'jacket', 'fruit', 'shelf', 'head', 'top', 'box', 'hair', 'backpack', 'market', 'bowl', 'banner', 'picture', 'hat', 'yellow', 'stem', 'poster', 'apple', 'arm', 'bucket', 'spice']
2022-03-17 00:03:23,794.794 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'hand', 'board', 'design', 'person', 'table', 'wall', 'label', 'picture', 'camera', 'handle', 'wheel', 'jacket', 'logo', 'cloth', 'bunch', 'sleeve', 'lid', 'candle', 'banana', 'jar']
2022-03-17 00:05:47,720.720 2829:trainer.py:487 do_train_dict(): eta: 12:17:06 iter: 40600 speed: 283.3 images/sec total_norm: 143.6671 (145.1691) loss: 144.4194 (143.9174) masked_loss: 1.5077 (1.5380) tag_loss: 143.0626 (142.3795) time: 1.4331 (1.8072) data: 0.0001 (0.0002) to_device: 0.0050 (0.0050) time_gpu: 1.4279 (1.8020) save_time: 8.8805 (16.9902) lr: 0.000039 max mem: 26307
2022-03-17 00:05:48,081.081 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5142857432365417
2022-03-17 00:05:48,081.081 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 147.3411865234375
2022-03-17 00:05:48,082.082 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.96891763110712
2022-03-17 00:06:08,528.528 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021212834864854813
2022-03-17 00:06:08,528.528 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 00:06:08,529.529 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'road', 'is', 'closed', 'off', 'via', 'signage', '[MASK]', 'cones', 'for', 'extra', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 00:06:08,544.544 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['letter', 'cone', 'sign', 'pole', 'sky', 'street', 'ground', 'car', 'truck', 'sidewalk', 'tree', '[UNK]', 'shadow', 'building', 'word', 'road', 'roof', 'orange', 'light', 'person', 'van', 'parking', 'window', 'man', 'traffic', 'can', 'construction', 'stop', 'background', 'lot', 'base', 'fence', 'flag', 'snow', 'house', 'stripe', 'tire', 'top', 'line', 'suv', 'trash', 'billboard', 'barrel', 'bridge', 'mountain', 'shirt', 'curb', 'post', 'wall', 'bag']
2022-03-17 00:06:24,437.437 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'can', 'house', 'line', 'building', 'road', 'street', 'car', 'ground', 'board', 'post', 'wall', 'lot', 'cover', 'window', 'tree', 'letter', 'sign', 'sky', 'protection', 'background', 'roof', 'extra', 'truck', 'parking', 'pole', 'trash', 'tire', 'cone', 'curb', 'chimney', 'weed', 'signage']
2022-03-17 00:08:48,697.697 2829:trainer.py:487 do_train_dict(): eta: 12:14:21 iter: 40700 speed: 282.9 images/sec total_norm: 145.4901 (148.0564) loss: 140.8905 (141.4956) masked_loss: 1.4829 (1.5416) tag_loss: 139.2438 (139.9540) time: 1.4329 (1.8097) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4276 (1.8045) save_time: 8.8805 (16.9902) lr: 0.000039 max mem: 26307
2022-03-17 00:08:49,057.057 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7272727489471436
2022-03-17 00:08:49,058.058 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 117.97797393798828
2022-03-17 00:08:49,058.058 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.97352829166488
2022-03-17 00:09:09,789.789 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021244116127490997
2022-03-17 00:09:09,789.789 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 00:09:09,790.790 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'on', 'the', 'pitching', 'mound', 'in', 'a', '"', 'after', 'pitching', '"', 'position', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 00:09:09,805.805 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'leg', 'dirt', 'shirt', 'uniform', 'head', 'hand', 'grass', 'man', 'letter', 'jersey', 'shoe', 'glove', 'baseball', 'field', 'mound', 'logo', 'arm', 'cap', 'face', 'player', 'hat', 'shadow', 'mouth', 'ear', 'ball', 'ground', 'hair', 'nose', 'name', 'number', 'pitcher', 'stripe', 'sleeve', 'sock', 'helmet', 'patch', 'pitch', 'foot', 'line', 'belt', 'eye', 'person', 'wall', 'game', 'beard', 'home', 'neck', 'finger', 'professional']
2022-03-17 00:09:25,666.666 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'name', 'hand', 'face', 'field', 'position', 'mouth', 'baseball', 'letter', 'shirt', 'jersey', 'leg', 'ear', 'shadow', 'grass', 'hat', 'uniform', 'dirt', 'pitcher', 'logo', 'beard', 'shoe', 'mound', 'pitching', 'glove']
2022-03-17 00:11:49,624.624 2829:trainer.py:487 do_train_dict(): eta: 12:11:36 iter: 40800 speed: 283.0 images/sec total_norm: 145.7211 (150.5294) loss: 142.6655 (143.6273) masked_loss: 1.4637 (1.4953) tag_loss: 141.0935 (142.1320) time: 1.4335 (1.8093) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4283 (1.8041) save_time: 8.8805 (16.9902) lr: 0.000039 max mem: 26307
2022-03-17 00:11:49,993.993 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663
2022-03-17 00:11:49,993.993 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 121.43850708007812
2022-03-17 00:11:49,993.993 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.98243673273286
2022-03-17 00:12:13,069.069 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02125268056988716
2022-03-17 00:12:13,069.069 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 00:12:13,070.070 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'laptop', 'sitting', 'on', 'a', '##unda', 'with', 'books', 'beside', 'it', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 00:12:13,085.085 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['laptop', 'keyboard', 'table', 'screen', 'floor', '[UNK]', 'leg', 'key', 'computer', 'book', 'chair', 'wall', 'cord', 'desk', 'paper', 'mouse', 'room', 'box', 'hat', 'pad', 'open', 'notebook', 'top', 'window', 'wire', 'coffee', 'logo', 'ipod', 'door', 'shelf', 'light', 'rug', 'napkin', 'cup', 'wooden', 'pen', 'cd', 'plate', 'person', 'next', 'stool', 'handle', 'phone', 'pillow', 'bowl', 'cable', 'tray', 'small', 'button', 'glass']
2022-03-17 00:12:29,168.168 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'book', 'floor', 'table', 'wall', 'key', 'chair', 'computer', 'box', 'sitting', 'screen', 'leg', 'desk', 'keyboard', 'cord', 'laptop']
03-17 00:13:05.805 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi
03-17 00:13:05.805 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi
03-17 00:13:06.929 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 95}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 94}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 97}]
2022-03-17 00:14:52,030.030 2829:trainer.py:487 do_train_dict(): eta: 12:08:51 iter: 40900 speed: 280.7 images/sec total_norm: 147.4822 (149.9739) loss: 141.3504 (142.6642) masked_loss: 1.5605 (1.5514) tag_loss: 139.9812 (141.1129) time: 1.4326 (1.8240) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4273 (1.8190) save_time: 8.8805 (16.9902) lr: 0.000038 max mem: 26307
2022-03-17 00:14:52,391.391 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6176470518112183
2022-03-17 00:14:52,392.392 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 166.48965454101562
2022-03-17 00:14:52,392.392 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.98645333080756
2022-03-17 00:15:13,240.240 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02129349298775196
2022-03-17 00:15:13,240.240 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 00:15:13,241.241 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'couple', '[MASK]', 'asian', 'people', 'eating', 'dinner', '[MASK]', 'a', 'restaurant', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 00:15:13,256.256 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hair', 'shirt', 'wall', 'table', 'glass', 'face', 'woman', 'restaurant', 'man', 'chair', 'head', 'hand', 'person', 'straw', 'nose', 'plate', 'glasses', 'eye', 'fork', '[UNK]', 'napkin', 'mouth', 'cup', 'brick', 'arm', 'booth', 'candle', 'seat', 'boy', 'eyebrow', 'salt', 'spoon', 'food', 'ear', 'wine', 'knife', 'drink', 'window', 'pizza', 'phone', 'juice', 'girl', 'bowl', 'menu', 'cake', 'water', 'couch', 'picture', 'logo', 'sign']
2022-03-17 00:15:29,191.191 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'hand', 'face', 'woman', 'hair', 'girl', 'person', 'table', 'wall', 'food', 'seat', 'boy', 'glass', 'couple', 'eye', 'chair', 'shirt', 'nose', 'wine', 'dinner', 'restaurant', 'plate', 'brick', 'knife', 'glasses', 'logo', 'booth', 'fork', 'cake', 'sauce', 'necklace', 'straw', 'candle', 'dessert', 'napkin']
2022-03-17 00:17:52,707.707 2829:trainer.py:487 do_train_dict(): eta: 12:06:05 iter: 41000 speed: 283.4 images/sec total_norm: 145.9117 (148.7334) loss: 141.9776 (144.4098) masked_loss: 1.5249 (1.5139) tag_loss: 140.9367 (142.8960) time: 1.4325 (1.8067) data: 0.0001 (0.0005) to_device: 0.0051 (0.0050) time_gpu: 1.4272 (1.8012) save_time: 8.8805 (16.9902) lr: 0.000038 max mem: 26307
2022-03-17 00:17:53,069.069 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5142857432365417
2022-03-17 00:17:53,069.069 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 133.8832550048828
2022-03-17 00:17:53,069.069 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.98577220944593
2022-03-17 00:18:13,771.771 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021356800571084023
2022-03-17 00:18:13,771.771 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 00:18:13,772.772 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'computer', 'screen', 'with', 'a', 'melting', 'apple', 'on', 'it', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 00:18:13,787.787 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['computer', 'monitor', 'desk', 'screen', 'wall', 'keyboard', 'table', 'mouse', 'light', 'laptop', 'lamp', '[UNK]', 'stand', 'speaker', 'cord', 'pad', 'base', 'logo', 'wire', 'curtain', 'picture', 'box', 'television', 'room', 'book', 'cup', 'icon', 'front', 'window', 'desktop', 'shelf', 'man', 'top', 'paper', 'phone', 'mug', 'green', 'image', 'hair', 'shade', 'handle', 'clock', 'next', 'glass', 'cat', 'head', 'cell', 'bottle', 'apple', 'shadow']
2022-03-17 00:18:29,640.640 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['light', 'table', 'wall', 'base', 'stand', 'computer', 'screen', 'desk', 'speaker', 'apple', 'mouse', 'monitor', 'shade', 'keyboard', 'lamp', 'cord', 'laptop', 'icon']
2022-03-17 00:20:53,635.635 2829:trainer.py:487 do_train_dict(): eta: 12:03:20 iter: 41100 speed: 283.0 images/sec total_norm: 148.2693 (150.8415) loss: 137.9602 (139.7773) masked_loss: 1.4969 (1.5241) tag_loss: 136.3296 (138.2532) time: 1.4321 (1.8094) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4270 (1.8043) save_time: 8.8805 (16.9902) lr: 0.000038 max mem: 26307
2022-03-17 00:20:53,996.996 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5428571701049805
2022-03-17 00:20:53,997.997 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 113.35835266113281
2022-03-17 00:20:53,997.997 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.99389804914159
2022-03-17 00:21:14,729.729 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02137037180364132
2022-03-17 00:21:14,730.730 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 00:21:14,730.730 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'she', 'is', 'talking', 'on', 'her', 'phone', '[MASK]', 'of', 'the', 'restaurant', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 00:21:14,746.746 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hair', 'window', 'sunglasses', 'woman', 'building', 'door', 'face', 'jean', 'sign', '[UNK]', 'sweater', 'shirt', 'phone', 'store', 'head', 'sidewalk', 'bench', 'hand', 'wall', 'car', 'girl', 'plant', 'cell', 'necklace', 'light', 'tree', 'pole', 'letter', 'banner', 'shadow', 'handle', 'fire', 'arm', 'person', 'chair', 'reflection', 'front', 'pot', 'restaurant', 'ground', 'jacket', 'nose', 'paint', 'glass', 'man', 'umbrella', 'bag', 'street', 'lady', 'glasses']
2022-03-17 00:21:30,687.687 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'hand', 'face', 'building', 'door', 'woman', 'car', 'hair', 'wall', 'phone', 'plant', 'window', 'watch', 'cell', 'sign', 'jean', 'shirt', 'nose', 'restaurant', 'shadow', 'ceiling', 'reflection', 'banner', 'decoration', 'sidewalk', 'necklace', 'sweater', 'sunglasses']
2022-03-17 00:23:54,581.581 2829:trainer.py:487 do_train_dict(): eta: 12:00:34 iter: 41200 speed: 283.0 images/sec total_norm: 144.8334 (148.6476) loss: 138.5683 (139.7458) masked_loss: 1.3650 (1.4131) tag_loss: 137.3355 (138.3327) time: 1.4319 (1.8094) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4267 (1.8042) save_time: 8.8805 (16.9902) lr: 0.000038 max mem: 26307
2022-03-17 00:23:54,942.942 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6470588445663452
2022-03-17 00:23:54,943.943 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 157.1354522705078
2022-03-17 00:23:54,943.943 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
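[Editor's note] The "caption acc" values are consistent with the fraction of [MASK]ed caption positions predicted correctly, accumulated in float32: 0.6470588445663452 is exactly 22/34 in float32, and 0.529411792755127 earlier is 18/34. The pipeline's own accuracy code is not shown; a minimal PyTorch sketch of that metric:

```python
import torch

def masked_caption_acc(logits, target, mask_positions):
    """Fraction of [MASK] positions whose argmax prediction matches the
    target token. Sketch only; not the pipeline's actual code.
    logits:         (batch, seq_len, vocab_size) decoder outputs
    target:         (batch, seq_len) ground-truth token ids
    mask_positions: bool tensor (batch, seq_len), True at [MASK]"""
    pred = logits.argmax(dim=-1)
    correct = (pred[mask_positions] == target[mask_positions]).float()
    return correct.mean()  # e.g. 22/34 -> 0.6470588445663452 in float32
```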
= 70.98711486705568
2022-03-17 00:24:15,777.777 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021388206630945206
2022-03-17 00:24:15,777.777 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 00:24:15,778.778 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'refrigerator', 'and', 'counter', 'in', 'a', 'small', 'absorption', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 00:24:15,793.793 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['window', 'cabinet', '[UNK]', 'shelf', 'ceiling', 'wall', 'kitchen', 'floor', 'door', 'handle', 'refrigerator', 'outlet', 'sink', 'drawer', 'light', 'paper', 'switch', 'stove', 'top', 'tile', 'frame', 'sign', 'room', 'oven', 'wood', 'wooden', 'vent', 'empty', 'glass', 'tree', 'box', 'rack', 'table', 'counter', 'towel', 'bottle', 'large', 'island', 'bar', 'cord', 'mirror', 'fridge', 'board', 'chair', 'view', 'hood', 'hand', 'label', 'cart', 'reflection']
2022-03-17 00:24:31,648.648 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'small', 'room', 'top', 'light', 'floor', 'wall', 'paper', 'window', 'metal', 'kitchen', 'counter', 'frame', 'handle', 'cabinet', 'ceiling', 'sink', 'shelf', 'drawer', 'outlet', 'tile', 'refrigerator']
2022-03-17 00:26:55,796.796 2829:trainer.py:487 do_train_dict(): eta: 11:57:49 iter: 41300 speed: 282.5 images/sec total_norm: 144.9906 (148.4171) loss: 141.1563 (141.4852) masked_loss: 1.4459 (1.5125) tag_loss: 139.8734 (139.9727) time: 1.4329 (1.8122) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4278 (1.8070) save_time: 8.8805 (16.9902) lr: 0.000038 max mem: 26307
2022-03-17 00:26:56,155.155 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.65625
2022-03-17 00:26:56,156.156 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 133.70407104492188
2022-03-17 00:26:56,156.156 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.99148092408112
2022-03-17 00:27:17,017.017 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021377334371209145
2022-03-17 00:27:17,017.017 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 00:27:17,018.018 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'couple', 'of', 'wild', 'animals', 'walking', '[MASK]', '[MASK]', 'the', 'day', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 00:27:17,033.033 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'bird', 'ground', 'elephant', 'field', 'bush', 'grass', 'leg', 'trunk', 'water', 'animal', 'ear', 'tail', 'head', '[UNK]', 'shadow', 'dirt', 'person', 'group', 'log', 'branch', 'duck', 'herd', 'hill', 'large', 'stick', 'wing', 'sheep', 'rock', 'pole', 'flock', 'small', 'grassy', 'bank', 'dog', 'body', 'river', 'couple', 'dry', 'next', 'open', 'white', 'area', 'fence', 'cow', 'man', 'wild', 'car', 'many', 'mound']
2022-03-17 00:27:33,086.086 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['day', 'water', 'field', 'ground', 'person', 'couple', 'structure', 'tree', 'wild', 'bird', 'grass', 'tail', 'bush', 'dirt', 'shelter', 'elephant', 'mound']
2022-03-17 00:29:56,878.878 2829:trainer.py:487 do_train_dict(): eta: 11:55:03 iter: 41400 speed: 282.7 images/sec total_norm: 146.6220 (149.9863) loss: 138.0694 (139.8062) masked_loss: 1.4989 (1.4981) tag_loss: 136.2042 (138.3082) time: 1.4334 (1.8108) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4282 (1.8057) save_time: 8.8805 (16.9902) lr: 0.000038 max mem: 26307
2022-03-17 00:29:57,238.238 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6666666865348816
2022-03-17 00:29:57,239.239 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 170.69540405273438
2022-03-17 00:29:57,239.239 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.99186012543828
2022-03-17 00:30:18,249.249 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021445048972964287
2022-03-17 00:30:18,249.249 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 00:30:18,250.250 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'sheep', ',', 'one', 'looking', 'at', 'the', 'camera', ',', 'while', '[MASK]', 'other', 'looks', 'away', 'are', 'in', 'the', 'wilderness', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 00:30:18,265.265 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['head', 'face', 'leg', 'sheep', 'tree', 'horn', 'fence', 'grass', 'wire', 'ground', 'trunk', 'post', 'ear', 'nose', 'pole', 'field', 'goat', 'bush', 'animal', 'ram', 'rock', 'leaf', 'branch', 'plant', 'mouth', 'standing', 'hill', 'wood', 'green', 'path', 'flower', 'white', 'dirt', 'forest', 'foot', 'fern', '[UNK]', 'log', 'eye', 'stick', 'couple', 'area', 'black', 'tail', 'grassy', 'bird', 'dog', 'top', 'sky', 'hillside']
2022-03-17 00:30:34,174.174 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['other', 'head', 'face', 'field', 'ground', 'post', 'plant', 'tree', 'leg', 'ear', 'bird', 'camera', 'grass', 'pole', 'horn', 'wire', 'sheep', 'fence', 'wilderness', 'goat']
2022-03-17 00:32:57,996.996 2829:trainer.py:487 do_train_dict(): eta: 11:52:17 iter: 41500 speed: 282.7 images/sec total_norm: 146.9815 (150.6781) loss: 139.4754 (141.0921) masked_loss: 1.4573 (1.4615) tag_loss: 137.8322 (139.6306) time: 1.4341 (1.8112) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4287 (1.8060) save_time: 8.8805 (16.9902) lr: 0.000038 max mem: 26307
2022-03-17 00:32:58,358.358 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6470588445663452
2022-03-17 00:32:58,359.359 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 118.09123229980469
2022-03-17 00:32:58,359.359 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.00295742658469 2022-03-17 00:33:19,472.472 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021421164274215698 2022-03-17 00:33:19,472.472 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 00:33:19,473.473 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'several', 'young', 'skate', '##board', '##ers', 'near', 'a', 'puddle', '[MASK]', 'water', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 00:33:19,488.488 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'building', 'man', '[UNK]', 'person', 'window', 'sky', 'jean', 'boy', 'hat', 'ground', 'sidewalk', 'shoe', 'hair', 'head', 'sign', 'cap', 'reflection', 'street', 'railing', 'city', 'light', 'group', 'hand', 'arm', 'ladder', 'floor', 'pole', 'rail', 'banner', 'fence', 'balcony', 'wheel', 'trick', 'glove', 'young', 'road', 'board', 'billboard', 'water', 'curb', 'car', 'walkway', 'bench', 'woman', 'air', 'skate', 'crane', 'bottle', 'number'] 2022-03-17 00:33:35,363.363 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'water', 'building', 'road', 'young', 'ground', 'hair', 'person', 'boy', 'window', 'sign', 'sky', 'jean', 'shirt', 'rail', 'wheel', 'hat', 'cap', 'reflection', 'shoe', 'sidewalk', 'railing', 'puddle'] 2022-03-17 00:35:59,084.084 2829:trainer.py:487 do_train_dict(): eta: 11:49:31 iter: 41600 speed: 282.7 images/sec total_norm: 146.7032 (149.6729) loss: 143.8828 (142.5958) masked_loss: 1.4399 (1.4936) tag_loss: 142.1304 (141.1022) time: 1.4328 (1.8109) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4278 (1.8058) save_time: 8.8805 (16.9902) lr: 0.000037 max mem: 26307 2022-03-17 00:35:59,445.445 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5 2022-03-17 00:35:59,445.445 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 144.02047729492188 2022-03-17 00:35:59,445.445 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.00820345615597 2022-03-17 00:36:20,612.612 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021423395723104477 2022-03-17 00:36:20,612.612 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 00:36:20,613.613 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'in', '[MASK]', 'swinging', 'pose', 'with', 'a', 'tennis', 'ra', '##c', '##quet', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 00:36:20,628.628 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['leg', 'sock', 'shoe', 'short', '[UNK]', 'man', 'tennis', 'court', 'shirt', 'hand', 'arm', 'head', 'wall', 'ball', 'shadow', 'hair', 'player', 'line', 'letter', 'ground', 'logo', 'handle', 'face', 'knee', 'band', 'ear', 'hat', 'male', 'person', 'sign', 'nose', 'stripe', 'writing', 'cap', 'string', 'match', 'mouth', 'beard', 'wrist', 'uniform', 'stand', 'eye', 'white', 'sleeve', 'serve', 'ready', 'air', 'game', 'chair', 'banner'] 2022-03-17 00:36:36,580.580 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'line', 'court', 'short', 'ground', 'hair', 'person', 'wall', 'arm', 'chair', 'letter', 'shirt', 'leg', 'tennis', 'shadow', 'jacket', 'bench', 'logo', 'shoe', 'swinging', 'pose', 'stripe', 'sock'] 2022-03-17 00:39:00,138.138 2829:trainer.py:487 do_train_dict(): eta: 11:46:46 iter: 41700 speed: 282.8 images/sec total_norm: 144.4076 (145.8507) loss: 139.5413 (141.3665) masked_loss: 1.5289 (1.5295) tag_loss: 138.0480 (139.8371) time: 1.4334 (1.8105) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4282 (1.8053) save_time: 8.8805 (16.9902) lr: 0.000037 max mem: 26307 2022-03-17 00:39:00,499.499 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7058823704719543 2022-03-17 00:39:00,499.499 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 124.20304870605469 2022-03-17 00:39:00,499.499 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.01553769088818 2022-03-17 00:39:21,750.750 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021411824971437454 2022-03-17 00:39:21,750.750 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 00:39:21,750.750 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', '[MASK]', 'a', 'tennis', 'rack', '##et', 'is', 'standing', 'on', '[MASK]', 'court', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 00:39:21,765.765 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shoe', '[UNK]', 'line', 'court', 'sock', 'tennis', 'shirt', 'hand', 'ground', 'short', 'woman', 'head', 'leg', 'hair', 'arm', 'handle', 'man', 'logo', 'person', 'girl', 'ball', 'fence', 'wall', 'uniform', 'player', 'net', 'face', 'hat', 'ear', 'tree', 'pole', 'letter', 'sign', 'skirt', 'ponytail', 'cap', 'chair', 'boy', 'jersey', 'stripe', 'top', 'watch', 'bracelet', 'band', 'bag', 'car', 'banner', 'wrist', 'shadow', 'glasses'] 2022-03-17 00:39:37,667.667 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'line', 'player', 'court', 'short', 'ground', 'hair', 'wall', 'arm', 'shirt', 'leg', 'background', 'nose', 'ear', 'tennis', 'bottle', 'hat', 'cap', 'logo', 'fence', 'shoe', 'sunglasses', 'stripe', 'sock'] 2022-03-17 00:42:01,417.417 2829:trainer.py:487 do_train_dict(): eta: 11:44:00 iter: 41800 speed: 282.4 images/sec total_norm: 145.6436 (148.5254) loss: 140.2720 (142.2225) masked_loss: 1.4315 (1.4636) tag_loss: 138.9981 (140.7589) time: 1.4333 (1.8128) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4281 (1.8077) save_time: 8.8805 (16.9902) lr: 0.000037 max mem: 26307 2022-03-17 00:42:01,777.777 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6176470518112183 2022-03-17 00:42:01,777.777 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 135.86753845214844 2022-03-17 00:42:01,778.778 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
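
Between successive reports the eta field shrinks by roughly 100 times the averaged per-iteration time, so it is presumably the running average iteration time multiplied by the iterations left. A minimal sketch under that assumption; max_iter never appears in the log, so the value used below is purely illustrative:

```python
import datetime

def format_eta(avg_iter_seconds, current_iter, max_iter):
    """ETA as the average iteration time times the iterations remaining.

    Assumption: this matches trainer.py's `eta:` field; max_iter is not
    printed anywhere in this log, so 65000 here is a made-up example.
    """
    remaining = int(avg_iter_seconds * (max_iter - current_iter))
    return str(datetime.timedelta(seconds=remaining))

print(format_eta(1.8108, 41400, 65000))  # -> 11:52:14
```
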
= 71.02708155556908 2022-03-17 00:42:23,030.030 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02141238935291767 2022-03-17 00:42:23,031.031 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 00:42:23,031.031 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'large', 'bed', 'with', 'a', 'attached', 'tables', '[MASK]', '[MASK]', 'lights', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 00:42:23,047.047 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'bed', 'floor', 'room', 'lamp', 'pillow', 'book', '[UNK]', 'table', 'mattress', 'light', 'sheet', 'nightstand', 'blanket', 'window', 'bedroom', 'chair', 'rug', 'vase', 'carpet', 'cushion', 'drawer', 'cup', 'shelf', 'door', 'blind', 'clock', 'tile', 'shade', 'tray', 'leg', 'reflection', 'box', 'cord', 'desk', 'white', 'phone', 'large', 'shadow', 'alarm', 'frame', 'seat', 'flower', 'cabinet', 'paper', 'television', 'remote', 'speaker', 'bottom', 'picture'] 2022-03-17 00:42:39,076.076 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'room', 'large', 'book', 'door', 'light', 'cup', 'floor', 'bed', 'wall', 'glass', 'attached', 'sheet', 'blanket', 'item', 'pillow', 'carpet', 'lamp', 'shelf', 'mattress', 'rug', 'cushion'] 03-17 00:43:07.021 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-17 00:43:07.021 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-17 00:43:08.347 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-17 00:45:02,627.627 2829:trainer.py:487 do_train_dict(): eta: 11:41:14 iter: 41900 speed: 282.5 images/sec total_norm: 146.8112 (148.5743) loss: 145.0779 (144.7114) masked_loss: 1.5479 (1.5577) tag_loss: 143.8079 (143.1536) time: 1.4322 (1.8121) data: 0.0001 (0.0002) to_device: 0.0050 (0.0050) time_gpu: 1.4271 (1.8070) save_time: 8.8805 (16.9902) lr: 0.000037 max mem: 26307 2022-03-17 00:45:02,989.989 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663 2022-03-17 00:45:02,990.990 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 122.1563491821289 2022-03-17 00:45:02,990.990 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.04008935292562 2022-03-17 00:45:24,171.171 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021418746560811996 2022-03-17 00:45:24,171.171 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 00:45:24,172.172 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'couple', 'of', 'different', 'soccer', 'players', 'are', 'competing', '[MASK]', 'the', 'field', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 00:45:24,187.187 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['short', 'hair', 'shirt', 'ball', 'sock', 'shoe', 'man', 'soccer', 'uniform', 'grass', 'hand', 'arm', 'tree', 'field', 'bag', 'head', 'backpack', 'jersey', 'logo', 'boy', 'leg', 'face', 'number', 'player', 'ground', 'line', 'game', 'pole', 'camera', '[UNK]', 'person', 'knee', 'background', 'young', 'stripe', 'group', 'fence', 'couple', 'blue', 'bottle', 'other', 'mouth', 'watch', 'back', 'white', 'trash', 'air', 'ear', 'bracelet', 'glove'] 2022-03-17 00:45:40,127.127 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['back', 'head', 'man', 'hand', 'number', 'different', 'player', 'short', 'field', 'ground', 'hair', 'arm', 'boy', 'couple', 'tree', 'ball', 'shirt', 'jersey', 'leg', 'background', 'bag', 'soccer', 'grass', 'uniform', 'logo', 'shoe', 'backpack', 'sock'] 2022-03-17 00:48:03,916.916 2829:trainer.py:487 do_train_dict(): eta: 11:38:28 iter: 42000 speed: 282.4 images/sec total_norm: 145.9572 (148.6094) loss: 143.3673 (143.4991) masked_loss: 1.4955 (1.4897) tag_loss: 141.8718 (142.0094) time: 1.4315 (1.8129) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4261 (1.8077) save_time: 8.8805 (16.9902) lr: 0.000037 max mem: 26307 2022-03-17 00:48:04,278.278 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7142857313156128 2022-03-17 00:48:04,278.278 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 145.319091796875 2022-03-17 00:48:04,278.278 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.04135333897099 2022-03-17 00:48:25,488.488 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02141851745545864 2022-03-17 00:48:25,489.489 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 00:48:25,489.489 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'tooth', 'brush', 'and', 'tube', 'of', 'tooth', 'paste', 'on', 'glass', 'surface', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 00:48:25,505.505 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'shelf', '[UNK]', 'handle', 'table', 'brush', 'glass', 'knife', 'object', 'tile', 'blade', 'cabinet', 'top', 'bottle', 'mirror', 'scissors', 'line', 'base', 'frame', 'ledge', 'white', 'door', 'board', 'microwave', 'kitchen', 'window', 'sink', 'counter', 'head', 'spoon', 'plate', 'water', 'shadow', 'container', 'panel', 'light', 'dish', 'leaf', 'button', 'small', 'reflection', 'vase', 'bar', 'screw', 'cup', 'tooth', 'drawer', 'clock', 'plant', 'hole'] 2022-03-17 00:48:41,389.389 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'door', 'table', 'wall', 'glass', 'surface', 'label', 'shadow', 'tube', 'brush', 'shelf', 'screw', 'tooth', 'paste'] 2022-03-17 00:51:05,172.172 2829:trainer.py:487 do_train_dict(): eta: 11:35:42 iter: 42100 speed: 282.5 images/sec total_norm: 147.9737 (150.6611) loss: 140.9359 (141.5875) masked_loss: 1.4245 (1.4539) tag_loss: 139.3352 (140.1336) time: 1.4322 (1.8126) data: 0.0001 (0.0005) to_device: 0.0050 (0.0049) time_gpu: 1.4270 (1.8072) save_time: 8.8805 (16.9902) lr: 0.000037 max mem: 26307 2022-03-17 00:51:05,533.533 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091 2022-03-17 00:51:05,534.534 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 181.57708740234375 2022-03-17 00:51:05,534.534 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
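
The caption acc values are small-denominator fractions (0.6666666865348816 is float32 2/3; 0.7058823704719543 is float32 12/17), which suggests accuracy is computed over the handful of supervised caption-token positions of a single sample rather than over the whole batch. A hedged sketch of that computation, borrowing the standard masked-LM ignore_index convention as an assumption:

```python
import torch

def masked_caption_accuracy(logits, labels, ignore_index=-100):
    """Accuracy over only the supervised caption-token positions.

    `labels` is assumed to hold target token ids at scored positions and
    ignore_index everywhere else (the usual masked-LM convention); the log
    alone does not reveal exactly which positions are scored.
    """
    scored = labels != ignore_index                 # positions that count
    preds = logits.argmax(dim=-1)                   # (batch, seq_len)
    correct = (preds[scored] == labels[scored]).sum()
    return (correct.float() / scored.sum().clamp(min=1)).item()

# 12 of 17 scored tokens correct reproduces the value printed in the log:
# torch.tensor(12 / 17, dtype=torch.float32).item() -> 0.7058823704719543
```
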
= 71.03069684177778 2022-03-17 00:51:27,005.005 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021461354568600655 2022-03-17 00:51:27,005.005 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 00:51:27,005.005 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'an', 'airport', 'with', 'a', 'large', 'white', 'passenger', 'jet', 'sitting', '[MASK]', 'a', 'tar', '##mac', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 00:51:27,021.021 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['window', 'sky', 'airport', 'airplane', 'floor', 'engine', 'wing', 'tail', 'building', '[UNK]', 'carpet', 'pole', 'ground', 'cloud', 'wall', 'truck', 'vehicle', 'person', 'runway', 'cart', 'wheel', 'man', 'car', 'terminal', 'cone', 'large', 'body', 'front', 'windshield', 'cockpit', 'chair', 'light', 'door', 'luggage', 'van', 'frame', 'plane', 'stair', 'bus', 'logo', 'gate', 'box', 'sign', 'nose', 'leg', 'seat', 'line', 'shadow', 'city', 'passenger'] 2022-03-17 00:51:42,965.965 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'body', 'building', 'large', 'white', 'football', 'car', 'ground', 'person', 'floor', 'wall', 'engine', 'airport', 'window', 'wing', 'sky', 'vehicle', 'passenger', 'truck', 'shadow', 'wheel', 'terminal', 'tail', 'cloud', 'pole', 'jet', 'runway', 'carpet', 'balcony', 'airplane', 'windshield'] 2022-03-17 00:54:06,555.555 2829:trainer.py:487 do_train_dict(): eta: 11:32:56 iter: 42200 speed: 282.3 images/sec total_norm: 148.0372 (150.4387) loss: 142.9486 (143.4243) masked_loss: 1.5630 (1.5307) tag_loss: 141.3828 (141.8936) time: 1.4329 (1.8138) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4278 (1.8086) save_time: 8.8805 (16.9902) lr: 0.000036 max mem: 26307 2022-03-17 00:54:06,917.917 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7058823704719543 2022-03-17 00:54:06,917.917 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 135.8490753173828 2022-03-17 00:54:06,917.917 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.03660656543488 2022-03-17 00:54:28,390.390 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021487630903720856 2022-03-17 00:54:28,391.391 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 00:54:28,391.391 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'kitchen', 'is', 'displayed', 'in', 'a', 'house', 'with', 'wooden', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 00:54:28,407.407 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['cabinet', 'kitchen', 'microwave', '[UNK]', 'handle', 'door', 'wall', 'outlet', 'oven', 'drawer', 'maker', 'coffee', 'refrigerator', 'cord', 'sink', 'bottle', 'stove', 'towel', 'window', 'ceiling', 'light', 'floor', 'pot', 'paper', 'control', 'kettle', 'counter', 'panel', 'container', 'clock', 'plug', 'jar', 'top', 'display', 'steel', 'knob', 'cup', 'picture', 'glass', 'box', 'knife', 'block', 'bowl', 'book', 'stainless', 'lid', 'magnet', 'telephone', 'can', 'white'] 2022-03-17 00:54:44,364.364 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'house', 'door', 'board', 'table', 'wall', 'window', 'kitchen', 'coffee', 'wooden', 'handle', 'cabinet', 'bottle', 'pan', 'sink', 'soap', 'glasses', 'maker', 'drawer', 'outlet', 'jar', 'stove', 'oven', 'microwave'] 2022-03-17 00:57:08,188.188 2829:trainer.py:487 do_train_dict(): eta: 11:30:09 iter: 42300 speed: 281.9 images/sec total_norm: 148.8757 (150.3574) loss: 138.9896 (139.6231) masked_loss: 1.5204 (1.5630) tag_loss: 137.2247 (138.0601) time: 1.4339 (1.8163) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4287 (1.8111) save_time: 8.8805 (16.9902) lr: 0.000036 max mem: 26307 2022-03-17 00:57:08,548.548 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6857143044471741 2022-03-17 00:57:08,549.549 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 130.0687255859375 2022-03-17 00:57:08,549.549 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.04761534816814 2022-03-17 00:57:29,971.971 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021544501185417175 2022-03-17 00:57:29,972.972 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 00:57:29,972.972 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'mountain', 'biker', 'pumps', 'his', '[MASK]', 'in', 'celebration', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 00:57:29,987.987 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['ground', 'bush', 'tree', 'dirt', 'rock', '[UNK]', 'sky', 'head', 'leg', 'hill', 'branch', 'shirt', 'road', 'shoe', 'stick', 'tail', 'wheel', 'grass', 'arm', 'hat', 'ear', 'brush', 'pole', 'bottle', 'man', 'truck', 'tire', 'hand', 'person', 'bike', 'bag', 'top', 'mountain', 'cloud', 'bench', 'motorcycle', 'face', 'handle', 'woman', 'jean', 'wood', 'jacket', 'trunk', 'short', 'bicycle', 'hair', 'field', 'cap', 'backpack', 'horse'] 2022-03-17 00:57:45,997.997 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'road', 'short', 'ground', 'arm', 'hill', 'mountain', 'tree', 'sky', 'shirt', 'leg', 'wheel', 'bush', 'dirt', 'fist', 'celebration', 'bike', 'bicycle', 'helmet', 'shoe', 'glove', 'sunglasses', 'sock', 'biker'] 2022-03-17 01:00:09,714.714 2829:trainer.py:487 do_train_dict(): eta: 11:27:23 iter: 42400 speed: 282.1 images/sec total_norm: 144.9691 (149.5484) loss: 141.5725 (143.4463) masked_loss: 1.4272 (1.4615) tag_loss: 139.9841 (141.9847) time: 1.4331 (1.8153) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4279 (1.8101) save_time: 8.8805 (16.9902) lr: 0.000036 max mem: 26307 2022-03-17 01:00:10,075.075 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4571428596973419 2022-03-17 01:00:10,075.075 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 145.24624633789062 2022-03-17 01:00:10,076.076 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.0422955052993 2022-03-17 01:00:31,773.773 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02159140259027481 2022-03-17 01:00:31,773.773 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 01:00:31,774.774 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'yellow', 'flower', 'emerges', 'from', 'a', 'blue', 'vase', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 01:00:31,789.789 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['vase', 'tile', 'wall', 'flower', '[UNK]', 'sink', 'stem', 'leaf', 'bathroom', 'water', 'blue', 'bottle', 'glass', 'mirror', 'table', 'line', 'counter', 'reflection', 'base', 'clear', 'handle', 'top', 'light', 'container', 'shelf', 'cabinet', 'kitchen', 'bottom', 'door', 'background', 'rack', 'window', 'floor', 'picture', 'white', 'cap', 'ring', 'soap', 'shadow', 'hand', 'paper', 'next', 'towel', 'tiled', 'holder', 'plant', 'shirt', 'knob', 'ledge', 'man'] 2022-03-17 01:00:47,721.721 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'water', 'top', 'blue', 'wall', 'base', 'yellow', 'bathroom', 'flower', 'leaf', 'sole', 'stem', 'reflection', 'container', 'tile', 'ledge', 'vase'] 2022-03-17 01:03:11,397.397 2829:trainer.py:487 do_train_dict(): eta: 11:24:37 iter: 42500 speed: 281.8 images/sec total_norm: 146.7993 (149.5911) loss: 143.1370 (143.0844) masked_loss: 1.4714 (1.5115) tag_loss: 141.5977 (141.5729) time: 1.4322 (1.8169) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4271 (1.8117) save_time: 8.8805 (16.9902) lr: 0.000036 max mem: 26307 2022-03-17 01:03:11,758.758 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6176470518112183 2022-03-17 01:03:11,758.758 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 115.534423828125 2022-03-17 01:03:11,758.758 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
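
Tag mAP hovers around 0.0215 while the Tag Precision running value sits near 71, so the two are evidently computed on different scales (mAP over the full ranked tag list, precision presumably at a fixed cutoff or threshold, reported in percent). One common way to compute a per-class mean average precision, offered only as an assumption about what tagger_caption_uni_pipeline_expanding.py does:

```python
import numpy as np
from sklearn.metrics import average_precision_score

def tag_mean_ap(scores, gt):
    """Per-class AP averaged over classes with at least one positive.

    scores: (N, C) predicted tag scores; gt: (N, C) binary ground truth.
    A sketch of one standard multi-label mAP, not a transcription of the
    pipeline's actual metric code.
    """
    aps = [average_precision_score(gt[:, c], scores[:, c])
           for c in range(gt.shape[1]) if gt[:, c].any()]
    return float(np.mean(aps)) if aps else 0.0
```
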
= 71.05517115167609 2022-03-17 01:03:33,164.164 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021591635420918465 2022-03-17 01:03:33,164.164 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 01:03:33,164.164 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'person', 'in', 'an', 'ocean', 'that', 'is', 'falling', 'off', 'his', 'surf', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 01:03:33,180.180 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wave', 'water', '[UNK]', 'ocean', 'head', 'arm', 'man', 'hand', 'surfer', 'hair', 'sky', 'leg', 'person', 'foot', 'board', 'short', 'shirt', 'foam', 'suit', 'wet', 'top', 'logo', 'surf', 'shore', 'mountain', 'beach', 'face', 'reflection', 'design', 'white', 'ripple', 'horizon', 'watch', 'woman', 'small', 'back', 'large', 'rock', 'body', 'blue', 'wake', 'boat', 'name', 'sea', 'spray', 'trunk', 'big', 'crest', 'cloud', 'fin'] 2022-03-17 01:03:49,040.040 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'water', 'board', 'hair', 'person', 'ocean', 'wave', 'logo', 'ripple', 'surfer'] 2022-03-17 01:06:12,894.894 2829:trainer.py:487 do_train_dict(): eta: 11:21:51 iter: 42600 speed: 282.1 images/sec total_norm: 146.3640 (148.7817) loss: 141.2117 (141.8549) masked_loss: 1.4163 (1.4640) tag_loss: 139.8284 (140.3910) time: 1.4330 (1.8150) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4277 (1.8097) save_time: 8.8805 (16.9902) lr: 0.000036 max mem: 26307 2022-03-17 01:06:13,258.258 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.529411792755127 2022-03-17 01:06:13,258.258 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 125.75096893310547 2022-03-17 01:06:13,258.258 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.06623224798913 2022-03-17 01:06:34,853.853 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021602030843496323 2022-03-17 01:06:34,853.853 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 01:06:34,854.854 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'people', 'are', 'watching', 'a', '[MASK]', 'doing', 'his', 'thing', 'on', 'a', 'skate', '##board', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 01:06:34,869.869 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'wheel', 'shirt', 'hand', 'ramp', 'person', 'park', 'arm', 'boy', 'man', 'shoe', 'short', 'skate', 'bowl', 'sock', 'pole', 'helmet', 'head', 'sky', 'knee', 'hat', 'tree', 'pad', 'building', 'fence', 'logo', 'background', 'skater', 'car', 'leg', 'sign', 'trick', 'board', 'pool', 'railing', 'light', 'umbrella', 'cap', 'tent', 'can', 'ground', 'rim', 'beach', 'bicycle', 'table', 'truck', 'shadow', 'hair', 'child', 'cloud'] 2022-03-17 01:06:50,780.780 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'park', 'woman', 'short', 'thing', 'person', 'child', 'table', 'arm', 'boy', 'guy', 'tree', 'shirt', 'background', 'bowl', 'camera', 'wheel', 'hat', 'knee', 'pole', 'tent', 'helmet', 'shoe', 'pad', 'umbrella', 'ramp', 'railing', 'skate', 'sock'] 2022-03-17 01:09:14,586.586 2829:trainer.py:487 do_train_dict(): eta: 11:19:05 iter: 42700 speed: 281.8 images/sec total_norm: 144.8645 (148.6564) loss: 141.0954 (143.3357) masked_loss: 1.4752 (1.4932) tag_loss: 139.7748 (141.8425) time: 1.4333 (1.8169) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4281 (1.8116) save_time: 8.8805 (16.9902) lr: 0.000036 max mem: 26307 2022-03-17 01:09:14,947.947 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663 2022-03-17 01:09:14,947.947 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 141.65634155273438 2022-03-17 01:09:14,947.947 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.07369912227738 2022-03-17 01:09:36,365.365 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02159293182194233 2022-03-17 01:09:36,365.365 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 01:09:36,366.366 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'sandwich', 'is', 'seen', '[MASK]', 'a', 'paper', 'plate', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 01:09:36,381.381 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sandwich', 'bread', 'table', 'plate', 'cheese', 'meat', 'food', '[UNK]', 'handle', 'crust', 'bacon', 'onion', 'bowl', 'fork', 'cup', 'napkin', 'egg', 'shadow', 'container', 'glass', 'mug', 'cut', 'white', 'coffee', 'knife', 'spoon', 'paper', 'half', 'top', 'sauce', 'tomato', 'toast', 'jar', 'hole', 'logo', 'french', 'bottle', 'chip', 'liquid', 'ham', 'rim', 'piece', 'close', 'blade', 'dish', 'leaf', 'butter', 'lid', 'bottom', 'cloth'] 2022-03-17 01:09:52,326.326 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['table', 'food', 'key', 'paper', 'plate', 'meat', 'button', 'wire', 'bread', 'mouse', 'keyboard', 'cord', 'sandwich', 'crust', 'napkin'] 2022-03-17 01:12:16,282.282 2829:trainer.py:487 do_train_dict(): eta: 11:16:18 iter: 42800 speed: 281.8 images/sec total_norm: 146.3325 (148.7189) loss: 144.5788 (144.6105) masked_loss: 1.4489 (1.5052) tag_loss: 143.1053 (143.1054) time: 1.4324 (1.8170) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4272 (1.8117) save_time: 8.8805 (16.9902) lr: 0.000036 max mem: 26307 2022-03-17 01:12:16,641.641 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6000000238418579 2022-03-17 01:12:16,641.641 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 146.23428344726562 2022-03-17 01:12:16,641.641 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.07512057466663 2022-03-17 01:12:38,319.319 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02159733511507511 2022-03-17 01:12:38,319.319 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 01:12:38,320.320 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'little', 'boy', 'and', '[MASK]', '[MASK]', 'looking', 'at', 'a', 'police', 'motorcycle', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 01:12:38,335.335 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['motorcycle', 'shirt', '[UNK]', 'light', 'tire', 'bike', 'hair', 'head', 'shoe', 'ground', 'boy', 'man', 'fender', 'hand', 'tank', 'short', 'shadow', 'tree', 'wheel', 'sidewalk', 'road', 'person', 'street', 'boot', 'arm', 'bag', 'jean', 'helmet', 'child', 'pole', 'building', 'sky', 'woman', 'seat', 'engine', 'pipe', 'car', 'leg', 'sunglasses', 'line', 'bush', 'fence', 'curb', 'wall', 'grass', 'mirror', 'windshield', 'gas', 'sign', 'window'] 2022-03-17 01:12:54,300.300 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'little', 'face', 'father', 'building', 'road', 'light', 'short', 'ground', 'hair', 'police', 'word', 'child', 'boy', 'engine', 'window', 'tree', 'letter', 'sky', 'shirt', 'ear', 'shadow', 'flag', 'wheel', 'mirror', 'bench', 'horn', 'bike', 'fence', 'motorcycle', 'boot', 'skirt', 'shoe', 'tire', 'sunglasses', 'fender', 'windshield'] 03-17 01:13:08.389 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-17 01:13:08.389 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-17 01:13:09.534 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-17 01:15:17,978.978 2829:trainer.py:487 do_train_dict(): eta: 11:13:32 iter: 42900 speed: 281.8 images/sec total_norm: 145.3660 (148.1841) loss: 141.8306 (142.2851) masked_loss: 1.5162 (1.5272) tag_loss: 140.3550 (140.7579) time: 1.4316 (1.8170) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4264 (1.8118) save_time: 8.8805 (16.9902) lr: 0.000035 max mem: 26307 2022-03-17 01:15:18,339.339 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6470588445663452 2022-03-17 01:15:18,339.339 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 167.22744750976562 2022-03-17 01:15:18,339.339 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.06134655087493 2022-03-17 01:15:40,066.066 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02160204015672207 2022-03-17 01:15:40,066.066 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 01:15:40,067.067 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'kitchen', 'that', 'has', '[MASK]', 'pink', 'cups', 'on', 'the', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 01:15:40,083.083 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'wall', 'stove', 'kitchen', 'cup', 'oven', 'knob', 'handle', 'door', 'cabinet', 'bottle', 'mug', 'top', 'shelf', 'jar', 'floor', 'coffee', 'lid', 'tile', 'window', 'pot', 'container', 'drawer', 'rack', 'spoon', 'sink', 'outlet', 'bowl', 'counter', 'bucket', 'pole', 'table', 'cord', 'box', 'refrigerator', 'kettle', 'glass', 'microwave', 'pitcher', 'white', 'wire', 'dish', 'plate', 'fan', 'can', 'fire', 'towel', 'small', 'cap', 'display'] 2022-03-17 01:15:55,979.979 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'top', 'door', 'cup', 'floor', 'wall', 'kitchen', 'counter', 'handle', 'cabinet', 'fan', 'bottle', 'liquid', 'lid', 'bucket', 'rack', 'mug', 'stove', 'coaster', 'knob', 'oven', 'refrigerator', 'kettle'] 2022-03-17 01:18:19,724.724 2829:trainer.py:487 do_train_dict(): eta: 11:10:45 iter: 43000 speed: 281.7 images/sec total_norm: 145.9306 (147.8647) loss: 136.6767 (137.7407) masked_loss: 1.4589 (1.4774) tag_loss: 134.9663 (136.2633) time: 1.4332 (1.8174) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4280 (1.8123) save_time: 8.8805 (16.9902) lr: 0.000035 max mem: 26307 2022-03-17 01:18:20,084.084 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5428571701049805 2022-03-17 01:18:20,084.084 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 144.02027893066406 2022-03-17 01:18:20,084.084 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.06931020820777 2022-03-17 01:18:41,731.731 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021593688055872917 2022-03-17 01:18:41,731.731 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 01:18:41,731.731 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'small', 'kid', 'and', 'a', 'person', 'with', '[MASK]', 'tooth', '##brush', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 01:18:41,747.747 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['mouth', 'eye', 'nose', 'hair', 'face', 'hand', 'girl', 'wall', 'child', '[UNK]', 'finger', 'ear', 'towel', 'forehead', 'person', 'teeth', 'arm', 'shirt', 'nail', 'head', 'cheek', 'sleeve', 'little', 'young', 'eyebrow', 'tongue', 'bathroom', 'handle', 'ring', 'neck', 'tile', 'brush', 'thumb', 'door', 'woman', 'boy', 'bang', 'chest', 'curtain', 'toilet', 'floor', 'blanket', 'knob', 'small', 'baby', 'kid', 'chin', 'toy', 'holder', 'tub'] 2022-03-17 01:18:57,654.654 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'hand', 'face', 'small', 'hair', 'girl', 'mouth', 'person', 'child', 'wall', 'eye', 'ring', 'shirt', 'finger', 'nose', 'ear', 'kid', 'handle', 'button', 'holder', 'sleeve', 'cuff'] 2022-03-17 01:21:21,610.610 2829:trainer.py:487 do_train_dict(): eta: 11:07:59 iter: 43100 speed: 281.5 images/sec total_norm: 147.0580 (148.7670) loss: 141.0400 (141.6718) masked_loss: 1.4622 (1.4708) tag_loss: 140.0831 (140.2011) time: 1.4334 (1.8189) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4282 (1.8137) save_time: 8.8805 (16.9902) lr: 0.000035 max mem: 26307 2022-03-17 01:21:21,970.970 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5714285969734192 2022-03-17 01:21:21,971.971 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 156.8335723876953 2022-03-17 01:21:21,971.971 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
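
The speed field stays near 282 images/sec while the windowed iteration time stays near 1.43 s, which is consistent with a global batch of roughly 405 images spread across the 8 V100s. The batch size itself is never printed, so the following is a consistency check only, with the batch size as an explicit assumption:

```python
def images_per_sec(global_batch_size, iter_seconds):
    """Throughput as printed in the `speed: 282.7 images/sec` field."""
    return global_batch_size / iter_seconds

# ~405 images per global batch over the windowed 1.433 s/iter matches the
# logged speed; 405 is inferred, not logged.
print(images_per_sec(405, 1.4330))  # -> ~282.6
```
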
= 71.06686391653838 2022-03-17 01:21:43,713.713 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021623732522130013 2022-03-17 01:21:43,713.713 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 01:21:43,714.714 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', '[MASK]', 'counter', 'and', 'sink', '[MASK]', 'dishes', 'on', 'them', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 01:21:43,729.729 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['kitchen', '[UNK]', 'cabinet', 'wall', 'window', 'handle', 'bottle', 'sink', 'stove', 'bowl', 'door', 'chair', 'oven', 'knob', 'lid', 'cup', 'floor', 'knife', 'drawer', 'pot', 'towel', 'dish', 'rack', 'board', 'tile', 'box', 'block', 'top', 'cutting', 'shelf', 'basket', 'spoon', 'sponge', 'pipe', 'plate', 'container', 'mug', 'bag', 'picture', 'paper', 'washing', 'white', 'refrigerator', 'hood', 'plant', 'jar', 'pan', 'pitcher', 'soap', 'counter'] 2022-03-17 01:21:59,641.641 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'top', 'cup', 'design', 'floor', 'wall', 'chair', 'window', 'kitchen', 'picture', 'bowl', 'counter', 'handle', 'plate', 'cabinet', 'knife', 'bottle', 'sink', 'pipe', 'shade', 'pot', 'holder', 'dish', 'towel', 'basket', 'lid', 'drawer', 'rack', 'spoon', 'stove', 'knob', 'oven', 'rug'] 2022-03-17 01:24:23,682.682 2829:trainer.py:487 do_train_dict(): eta: 11:05:12 iter: 43200 speed: 281.2 images/sec total_norm: 145.5534 (147.8206) loss: 139.4266 (140.4954) masked_loss: 1.5146 (1.5144) tag_loss: 138.0130 (138.9810) time: 1.4345 (1.8208) data: 0.0001 (0.0005) to_device: 0.0052 (0.0050) time_gpu: 1.4292 (1.8153) save_time: 8.8805 (16.9902) lr: 0.000035 max mem: 26307 2022-03-17 01:24:24,044.044 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4848484992980957 2022-03-17 01:24:24,044.044 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 144.73623657226562 2022-03-17 01:24:24,044.044 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.0748316300108 2022-03-17 01:24:46,006.006 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021662456914782524 2022-03-17 01:24:46,007.007 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 01:24:46,007.007 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'den', 'with', 'a', 'table', ',', 'couch', '[MASK]', 'television', 'and', '[MASK]', '##s', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 01:24:46,022.022 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'table', 'room', 'floor', 'television', 'glass', 'picture', 'cabinet', 'couch', 'chair', 'ceiling', 'bowl', 'living', 'door', 'coffee', 'pillow', 'shelf', 'lamp', 'center', 'entertainment', 'plant', 'light', 'book', 'shade', 'window', 'speaker', 'stand', 'plate', 'sofa', 'drawer', 'rug', 'vase', '[UNK]', 'clock', 'reflection', 'frame', 'curtain', 'furniture', 'cushion', 'painting', 'flower', 'candle', 'screen', 'dresser', 'mirror', 'base', 'tray', 'outlet', 'large', 'wooden'] 2022-03-17 01:25:02,001.001 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['room', 'door', 'light', 'cup', 'television', 'floor', 'table', 'wall', 'glass', 'chair', 'plant', 'figure', 'window', 'picture', 'coffee', 'leg', 'bowl', 'clock', 'cabinet', 'speaker', 'ceiling', 'couch', 'flower', 'remote', 'sculpture', 'switch', 'den', 'pot', 'pillow', 'curtain', 'shelf', 'drawer', 'cushion'] 2022-03-17 01:27:25,522.522 2829:trainer.py:487 do_train_dict(): eta: 11:02:26 iter: 43300 speed: 281.6 images/sec total_norm: 145.5002 (149.2901) loss: 139.2665 (141.3544) masked_loss: 1.5310 (1.5388) tag_loss: 137.6039 (139.8156) time: 1.4333 (1.8183) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4280 (1.8131) save_time: 8.8805 (16.9902) lr: 0.000035 max mem: 26307 2022-03-17 01:27:25,882.882 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5714285969734192 2022-03-17 01:27:25,882.882 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 142.45013427734375 2022-03-17 01:27:25,882.882 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.07677402584234 2022-03-17 01:27:47,679.679 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021678118035197258 2022-03-17 01:27:47,679.679 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 01:27:47,680.680 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', 'sun', 'sets', '[MASK]', 'a', 'vacant', 'sienna', 'area', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 01:27:47,695.695 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'pole', 'light', 'building', 'tree', 'street', 'sun', 'traffic', 'sign', 'fence', 'sunset', 'roof', 'city', 'car', 'person', '[UNK]', 'road', 'bird', 'window', 'tower', 'dusk', 'line', 'van', 'sidewalk', 'night', 'ground', 'top', 'trunk', 'truck', 'parking', 'branch', 'pillar', 'chimney', 'bench', 'wire', 'lot', 'telephone', 'man', 'arrow', 'intersection', 'bicycle', 'horizon', 'lamp', 'bus', 'box', 'hill', 'stop', 'post', 'fire', 'statue'] 2022-03-17 01:28:03,646.646 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['house', 'area', 'building', 'street', 'light', 'ground', 'arm', 'sun', 'window', 'tree', 'sign', 'sky', 'industrial', 'traffic', 'roof', 'truck', 'grass', 'pole', 'fence', 'vacant', 'balcony'] 2022-03-17 01:30:27,494.494 2829:trainer.py:487 do_train_dict(): eta: 10:59:39 iter: 43400 speed: 281.4 images/sec total_norm: 146.2397 (150.1466) loss: 142.5784 (143.7262) masked_loss: 1.5533 (1.5260) tag_loss: 140.9010 (142.2003) time: 1.4336 (1.8197) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4285 (1.8146) save_time: 8.8805 (16.9902) lr: 0.000035 max mem: 26307 2022-03-17 01:30:27,855.855 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6111111044883728 2022-03-17 01:30:27,855.855 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 139.47756958007812 2022-03-17 01:30:27,856.856 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.08488894933942 2022-03-17 01:30:49,771.771 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02168019860982895 2022-03-17 01:30:49,772.772 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 01:30:49,772.772 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'an', 'elephant', 'walking', 'around', 'in', 'the', '[MASK]', 'on', '[MASK]', 'sunny', 'day', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 01:30:49,788.788 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'leg', 'sky', 'elephant', 'tail', 'grass', 'ear', 'ground', 'rock', 'water', 'forest', 'trunk', 'head', 'back', 'shadow', 'dirt', 'log', 'background', 'fence', 'pole', '[UNK]', 'zoo', 'sand', 'foot', 'field', 'fountain', 'pipe', 'roof', 'pool', 'eye', 'enclosure', 'bush', 'tank', 'boulder', 'cloud', 'barrel', 'large', 'road', 'puddle', 'structure', 'building', 'stick', 'area', 'person', 'animal', 'body', 'wall', 'bird', 'waterfall', 'mountain'] 2022-03-17 01:31:05,640.640 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'back', 'head', 'day', 'water', 'ground', 'rock', 'forest', 'tree', 'box', 'sky', 'walking', 'leg', 'background', 'ear', 'shadow', 'grass', 'tail', 'bush', 'stick', 'pole', 'dirt', 'fence', 'zoo', 'elephant', 'sunny'] 2022-03-17 01:33:29,513.513 2829:trainer.py:487 do_train_dict(): eta: 10:56:52 iter: 43500 speed: 281.3 images/sec total_norm: 146.1836 (147.8293) loss: 143.3249 (142.5361) masked_loss: 1.5377 (1.5199) tag_loss: 142.1102 (141.0163) time: 1.4332 (1.8202) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4283 (1.8151) save_time: 8.8805 (16.9902) lr: 0.000035 max mem: 26307 2022-03-17 01:33:29,873.873 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.42424243688583374 2022-03-17 01:33:29,873.873 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 118.48931884765625 2022-03-17 01:33:29,873.873 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.0874801688238 2022-03-17 01:33:51,953.953 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021685268729925156 2022-03-17 01:33:51,953.953 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 01:33:51,953.953 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'don', '[MASK]', 'are', 'inside', 'of', 'a', 'foil', '##ed', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 01:33:51,969.969 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['foil', '[UNK]', 'sandwich', 'reflection', 'paper', 'table', 'person', 'pastry', 'light', 'hand', 'leaf', 'bread', 'aluminum', 'food', 'tin', 'hot', 'bun', 'stem', 'hole', 'next', 'close', 'white', 'onion', 'top', 'half', 'tomato', 'finger', 'green', 'other', 'plate', 'plastic', 'arm', 'head', 'cheese', 'dog', 'end', 'glass', 'cup', 'sleeve', 'small', 'wall', 'large', 'logo', 'hamburger', 'couple', 'ice', 'shoe', 'meat', 'piece', 'ready'] 2022-03-17 01:34:07,929.929 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'paper', 'reflection', 'sandwich', 'container', 'foil', 'tomato', 'pastry'] 2022-03-17 01:36:31,863.863 2829:trainer.py:487 do_train_dict(): eta: 10:54:06 iter: 43600 speed: 280.8 images/sec total_norm: 146.7132 (149.1552) loss: 141.0429 (142.9773) masked_loss: 1.4744 (1.5146) tag_loss: 139.4549 (141.4628) time: 1.4339 (1.8236) data: 0.0001 (0.0002) to_device: 0.0052 (0.0050) time_gpu: 1.4285 (1.8184) save_time: 8.8805 (16.9902) lr: 0.000034 max mem: 26307 2022-03-17 01:36:32,222.222 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5625 2022-03-17 01:36:32,223.223 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 105.09342193603516 2022-03-17 01:36:32,223.223 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.10114144296887 2022-03-17 01:36:54,277.277 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021676894277334213 2022-03-17 01:36:54,277.277 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 01:36:54,278.278 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'person', 'in', '[MASK]', 'boots', ',', 'on', 'ski', '##is', 'with', '[MASK]', 'ski', 'poles', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 01:36:54,293.293 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['ski', 'boot', 'snow', 'pole', '[UNK]', 'ground', 'shadow', 'person', 'track', 'leg', 'strap', 'glove', 'skier', 'foot', 'handle', 'sky', 'jacket', 'hand', 'stripe', 'line', 'man', 'coat', 'slope', 'snowy', 'tree', 'pair', 'red', 'tag', 'poles', 'trail', 'face', 'orange', 'back', 'backpack', 'woman', 'hill', 'couple', 'country', 'top', 'hat', 'gear', 'other', 'shirt', 'design', 'side', 'footprint', 'equipment', 'way', 'scarf', 'sunglasses'] 2022-03-17 01:37:10,154.154 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'line', 'ground', 'track', 'person', 'leg', 'snow', 'shadow', 'pole', 'ski', 'boot', 'strap'] 2022-03-17 01:39:33,958.958 2829:trainer.py:487 do_train_dict(): eta: 10:51:19 iter: 43700 speed: 281.2 images/sec total_norm: 146.5332 (148.3827) loss: 144.1066 (143.8117) masked_loss: 1.4286 (1.4971) tag_loss: 142.9181 (142.3146) time: 1.4330 (1.8209) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4277 (1.8158) save_time: 8.8805 (16.9902) lr: 0.000034 max mem: 26307 2022-03-17 01:39:34,319.319 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6060606241226196 2022-03-17 01:39:34,320.320 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 131.47254943847656 2022-03-17 01:39:34,320.320 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.10382006048611 2022-03-17 01:39:56,492.492 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021680761128664017 2022-03-17 01:39:56,492.492 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 01:39:56,493.493 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'little', 'girl', 'wearing', 'a', 'helmet', 'and', 'holding', 'a', '[MASK]', 'bat', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 01:39:56,508.508 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'helmet', 'bat', 'person', 'grass', 'hand', 'eye', 'girl', 'short', 'child', 'baseball', 'man', 'fence', 'boy', 'shoe', 'letter', 'nose', '[UNK]', 'sock', 'hair', 'face', 'handle', 'arm', 'jersey', 'tree', 'jean', 'head', 'dirt', 'hat', 'field', 'woman', 'leg', 'young', 'little', 'spectator', 'logo', 'ground', 'writing', 'necklace', 'number', 'kid', 'pole', 'strap', 'stripe', 'mouth', 'chair', 'cap', 'ball', 'swing', 'bracelet'] 2022-03-17 01:40:12,483.483 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'hand', 'little', 'short', 'field', 'ground', 'girl', 'person', 'child', 'arm', 'boy', 'eye', 'baseball', 'letter', 'shirt', 'jersey', 'leg', 'handle', 'grass', 'hat', 'pole', 'bat', 'fence', 'helmet', 'shoe', 'strap', 'spectator', 'sock'] 2022-03-17 01:42:36,389.389 2829:trainer.py:487 do_train_dict(): eta: 10:48:32 iter: 43800 speed: 280.7 images/sec total_norm: 143.8127 (149.4073) loss: 139.7323 (141.3447) masked_loss: 1.3922 (1.4657) tag_loss: 137.9831 (139.8790) time: 1.4331 (1.8243) data: 0.0001 (0.0001) to_device: 0.0052 (0.0051) time_gpu: 1.4278 (1.8190) save_time: 8.8805 (16.9902) lr: 0.000034 max mem: 26307 2022-03-17 01:42:36,749.749 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6666666865348816 2022-03-17 01:42:36,750.750 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 119.12374114990234 2022-03-17 01:42:36,750.750 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
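The speed figure is consistent with the batch size of 512 in the run name divided by the parenthesised average iteration time. Inverting a plausible eta formula against the logged values also suggests where the training horizon sits, though the total iteration count is not stated in this excerpt; treat the 65100 below as an inference only:

```python
import datetime

batch_size = 512          # from the run name: batch-size_512
avg_iter_time = 1.8243    # parenthesised 'time' average at iter 43800
print(round(batch_size / avg_iter_time, 1))   # -> 280.7, as logged

# If eta = (max_iter - iter) * avg_iter_time, the logged eta of
# 10:48:32 (38912 s) at iter 43800 implies a horizon near
# 43800 + 38912 / 1.8243 ~= 65100 iterations (inferred, not logged).
print(datetime.timedelta(seconds=int((65100 - 43800) * avg_iter_time)))
# -> 10:47:37 under that assumed horizon, close to the logged eta
```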
= 71.11166987495162 2022-03-17 01:42:59,095.095 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0216783806681633 2022-03-17 01:42:59,095.095 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 01:42:59,096.096 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'boat', 'in', 'the', '[MASK]', 'with', 'several', 'people', 'on', 'it', 'with', 'a', 'flag', 'on', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 01:42:59,111.111 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['water', 'person', 'boat', 'man', 'flag', 'sky', 'wave', 'reflection', 'ripple', 'wake', 'pole', 'jacket', 'ocean', 'motor', 'shadow', '[UNK]', 'head', 'door', 'small', 'raft', 'shirt', 'mountain', 'large', 'engine', 'snow', 'front', 'railing', 'hat', 'antenna', 'splash', 'seat', 'white', 'group', 'couple', 'speed', 'clear', 'fishing', 'stripe', 'top', 'ski', 'hair', 'blue', 'tire', 'open', 'american', 'vest', 'day', 'middle', 'ship', 'cabin'] 03-17 01:43:09.635 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-17 01:43:09.635 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-17 01:43:10.304 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 13}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 9}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 12}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 12}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 12}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 12}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 12}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 12}] 2022-03-17 01:43:15,097.097 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'several', 'water', 'person', 'sky', 'boat', 'ocean', 'wave', 'motor', 'flag', 'wake', 'pole', 'jacket', 'reflection', 'ripple'] 2022-03-17 01:45:38,723.723 2829:trainer.py:487 do_train_dict(): eta: 10:45:46 iter: 43900 speed: 280.8 images/sec total_norm: 144.3893 (146.5414) loss: 138.6804 (140.2232) masked_loss: 1.4478 (1.4852) tag_loss: 137.3409 (138.7380) time: 1.4340 (1.8233) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4287 (1.8181) save_time: 8.8805 (16.9902) lr: 0.000034 max mem: 26307 2022-03-17 01:45:39,084.084 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5151515007019043 2022-03-17 01:45:39,084.084 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 114.24961853027344 2022-03-17 01:45:39,085.085 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.1247149814259 2022-03-17 01:46:01,268.268 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021682890132069588 2022-03-17 01:46:01,269.269 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 01:46:01,269.269 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'zebra', 'has', 'stuck', 'its', 'head', 'into', 'a', '[MASK]', 'with', 'a', 'passenger', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 01:46:01,284.284 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['car', 'head', 'tree', 'sky', 'nose', 'ear', 'door', 'eye', '[UNK]', 'jean', 'zebra', 'face', 'mouth', 'windshield', 'grass', 'person', 'shirt', 'hair', 'window', 'muzzle', 'neck', 'jacket', 'fence', 'hand', 'button', 'mirror', 'man', 'vent', 'dashboard', 'bag', 'leg', 'mane', 'seat', 'arm', 'handle', 'ground', 'camera', 'stripe', 'light', 'roof', 'pole', 'next', 'front', 'wheel', 'steering', 'black', 'driver', 'close', 'wall', 'side'] 2022-03-17 01:46:17,114.114 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'door', 'car', 'hair', 'mouth', 'person', 'eye', 'tree', 'sky', 'jean', 'shirt', 'nose', 'bag', 'passenger', 'shoe', 'vent', 'windshield', 'zebra'] 2022-03-17 01:48:41,171.171 2829:trainer.py:487 do_train_dict(): eta: 10:42:59 iter: 44000 speed: 280.6 images/sec total_norm: 147.5025 (150.8654) loss: 140.0975 (141.8490) masked_loss: 1.5035 (1.4859) tag_loss: 138.3601 (140.3631) time: 1.4320 (1.8245) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4267 (1.8192) save_time: 8.8805 (16.9902) lr: 0.000034 max mem: 26307 2022-03-17 01:48:41,533.533 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6764705777168274 2022-03-17 01:48:41,534.534 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 167.94195556640625 2022-03-17 01:48:41,534.534 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
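The "Input ids sample" lines show the caption pipeline's text input: WordPiece tokens wrapped in [CLS]/[SEP], a few positions swapped for [MASK], and [PAD] out to a fixed length of 70. A sketch of that preprocessing, assuming the HuggingFace bert-base-uncased tokenizer (the '##' word pieces in the log are consistent with it) and the usual 15% BERT masking rate, neither of which this excerpt confirms:

```python
import random
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

def make_masked_input(caption, max_len=70, mask_rate=0.15):
    tokens = ["[CLS]"] + tokenizer.tokenize(caption) + ["[SEP]"]
    for i in range(1, len(tokens) - 1):       # keep [CLS]/[SEP] intact
        if random.random() < mask_rate:
            tokens[i] = "[MASK]"
    tokens += ["[PAD]"] * (max_len - len(tokens))   # pad to fixed length
    return tokenizer.convert_tokens_to_ids(tokens), tokens

ids, tokens = make_masked_input(
    "a zebra has stuck its head into a car with a passenger .")
print(tokens[:15])
```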
= 71.12121427194332 2022-03-17 01:49:03,951.951 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021646689623594284 2022-03-17 01:49:03,951.951 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 01:49:03,952.952 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', '[MASK]', 'carolyn', 'the', 'floor', 'holding', 'a', 'ra', '##c', '##quet', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 01:49:03,967.967 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'leg', 'sock', 'hand', 'short', 'man', '[UNK]', 'face', 'shoe', 'tennis', 'head', 'arm', 'ear', 'background', 'nose', 'hair', 'eye', 'stripe', 'logo', 'mouth', 'ball', 'handle', 'ground', 'band', 'knee', 'sleeve', 'shadow', 'player', 'court', 'wall', 'string', 'line', 'beard', 'white', 'male', 'floor', 'glasses', 'letter', 'collar', 'design', 'photo', 'red', 'match', 'finger', 'air', 'neck', 'wrist', 'grass', 'hat', 'net'] 2022-03-17 01:49:19,898.898 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'band', 'short', 'ground', 'hair', 'mouth', 'floor', 'wall', 'arm', 'eye', 'shirt', 'leg', 'background', 'nose', 'ear', 'handle', 'tennis', 'string', 'knee', 'shoe', 'stripe', 'sock'] 2022-03-17 01:51:43,559.559 2829:trainer.py:487 do_train_dict(): eta: 10:40:12 iter: 44100 speed: 280.7 images/sec total_norm: 146.6852 (150.8122) loss: 139.5549 (140.8831) masked_loss: 1.4715 (1.5170) tag_loss: 138.4529 (139.3660) time: 1.4338 (1.8239) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4286 (1.8188) save_time: 8.8805 (16.9902) lr: 0.000034 max mem: 26307 2022-03-17 01:51:43,921.921 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.529411792755127 2022-03-17 01:51:43,921.921 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 121.39515686035156 2022-03-17 01:51:43,921.921 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.12570458839382 2022-03-17 01:52:06,042.042 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02166532538831234 2022-03-17 01:52:06,043.043 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 01:52:06,043.043 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'group', 'of', 'zebra', '[MASK]', 'foraging', 'in', 'the', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 01:52:06,059.059 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['zebra', 'shadow', 'leg', 'ground', 'mane', 'grass', 'head', '[UNK]', 'ear', 'stripe', 'neck', 'tail', 'eye', 'dirt', 'nose', 'mouth', 'rock', 'fence', 'tree', 'trunk', 'field', 'body', 'log', 'bush', 'background', 'spot', 'water', 'pole', 'branch', 'wall', 'hay', 'other', 'reflection', 'bird', 'back', 'couple', 'next', 'area', 'mesh', 'group', 'leaf', 'grazing', 'shade', 'plant', 'food', 'line', 'zoo', 'post', 'hill', 'grassy'] 2022-03-17 01:52:21,969.969 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'group', 'ground', 'mouth', 'eye', 'neck', 'tree', 'leg', 'ear', 'shadow', 'grass', 'tail', 'dirt', 'fence', 'log', 'stripe', 'mane', 'zebra'] 2022-03-17 01:54:46,085.085 2829:trainer.py:487 do_train_dict(): eta: 10:37:25 iter: 44200 speed: 280.5 images/sec total_norm: 147.2032 (152.4778) loss: 143.3514 (144.0022) masked_loss: 1.5084 (1.5121) tag_loss: 141.9420 (142.4901) time: 1.4341 (1.8252) data: 0.0001 (0.0001) to_device: 0.0052 (0.0051) time_gpu: 1.4290 (1.8200) save_time: 8.8805 (16.9902) lr: 0.000033 max mem: 26307 2022-03-17 01:54:46,447.447 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6666666865348816 2022-03-17 01:54:46,447.447 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 146.79379272460938 2022-03-17 01:54:46,447.447 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
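The caption acc values are all small-denominator fractions (0.6060606... is 20/33, 0.529411... is 18/34), which is consistent with accuracy measured only over the handful of masked positions scored in each batch. A sketch under that assumption:

```python
import torch

def masked_token_accuracy(logits, target_ids, mask_positions):
    """Accuracy of the language head, scored only where the input was
    masked.  logits: (seq, vocab); target_ids, mask_positions: (seq,)."""
    pred = logits.argmax(dim=-1)
    correct = (pred[mask_positions] == target_ids[mask_positions]).float()
    return correct.mean().item()

print(20 / 33)   # 0.6060606..., matching the acc logged at iter 43700
```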
= 71.12467487240484 2022-03-17 01:55:08,871.871 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021704429760575294 2022-03-17 01:55:08,871.871 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 01:55:08,872.872 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'bike', 'is', 'tied', 'to', 'a', '[MASK]', 'while', 'a', 'su', '##fer', 'walks', 'on', 'the', '[MASK]', '[MASK]', 'the', 'water', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 01:55:08,887.887 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'water', 'cloud', 'pole', 'bicycle', '[UNK]', 'bike', 'wave', 'ocean', 'bird', 'man', 'board', 'beach', 'wheel', 'person', 'sand', 'hair', 'tire', 'horizon', 'wing', 'short', 'boy', 'head', 'shirt', 'surf', 'arm', 'woman', 'rock', 'kite', 'basket', 'shore', 'leg', 'seat', 'post', 'bag', 'shadow', 'boat', 'top', 'hat', 'paddle', 'footprint', 'body', 'suit', 'foot', 'back', 'pedal', 'child', 'handle', 'hand', 'ground'] 2022-03-17 01:55:24,831.831 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'water', 'board', 'person', 'seat', 'arm', 'boy', 'wing', 'beach', 'sky', 'shirt', 'ocean', 'leg', 'wave', 'bird', 'wheel', 'sand', 'cloud', 'pole', 'walks', 'bike', 'bicycle', 'pedal'] 2022-03-17 01:57:48,699.699 2829:trainer.py:487 do_train_dict(): eta: 10:34:38 iter: 44300 speed: 280.4 images/sec total_norm: 148.8967 (150.1477) loss: 139.6569 (141.2873) masked_loss: 1.4802 (1.4573) tag_loss: 138.4310 (139.8299) time: 1.4327 (1.8262) data: 0.0001 (0.0005) to_device: 0.0051 (0.0050) time_gpu: 1.4275 (1.8206) save_time: 8.8805 (16.9902) lr: 0.000033 max mem: 26307 2022-03-17 01:57:49,060.060 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091 2022-03-17 01:57:49,061.061 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 133.86709594726562 2022-03-17 01:57:49,061.061 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.13292645119331 2022-03-17 01:58:11,573.573 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02169646881520748 2022-03-17 01:58:11,573.573 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 01:58:11,574.574 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'picture', 'of', 'a', 'steep', '##le', 'and', 'clocks', '[MASK]', 'a', 'building', 'with', 'a', 'snow', '[MASK]', 'roof', 'with', 'a', 'neon', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 01:58:11,589.589 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['clock', 'building', 'tower', 'sky', 'window', 'roof', 'sign', 'light', '[UNK]', 'top', 'snow', 'ground', 'tree', 'hill', 'church', 'hand', 'large', 'pole', 'weather', 'spire', 'wall', 'bird', 'lamp', 'night', 'bush', 'bus', 'tall', 'chimney', 'house', 'cross', 'vane', 'front', 'car', 'dome', 'street', 'door', 'white', 'triangle', 'rock', 'traffic', 'truck', 'road', 'big', 'old', 'middle', 'towering', 'statue', 'train', 'side', 'snowy'] 2022-03-17 01:58:27,532.532 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'building', 'light', 'distance', 'window', 'tree', 'tower', 'branch', 'sign', 'sky', 'roof', 'snow', 'clock', 'neon', 'chimney'] 2022-03-17 02:00:51,147.147 2829:trainer.py:487 do_train_dict(): eta: 10:31:51 iter: 44400 speed: 280.6 images/sec total_norm: 146.3768 (148.8367) loss: 144.4569 (144.4037) masked_loss: 1.5678 (1.5769) tag_loss: 142.9069 (142.8268) time: 1.4327 (1.8245) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4277 (1.8193) save_time: 8.8805 (16.9902) lr: 0.000033 max mem: 26307 2022-03-17 02:00:51,508.508 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4864864945411682 2022-03-17 02:00:51,508.508 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 135.26873779296875 2022-03-17 02:00:51,509.509 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
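Each "Sample Generation" line lists exactly 50 tags, which looks like the top-50 scores from the tag classifier head rather than a thresholded set. A sketch of that decoding, assuming a multi-label sigmoid head (not confirmed by this excerpt):

```python
import torch

def top_tags(tag_logits, vocab, k=50):
    """Return the k highest-scoring tags for one image.
    tag_logits: (num_tags,) raw scores from the tag classifier head."""
    scores = torch.sigmoid(tag_logits)        # multi-label probabilities
    topk = torch.topk(scores, k=min(k, scores.numel()))
    return [vocab[i] for i in topk.indices.tolist()]

vocab = ["ski", "boot", "snow", "pole", "tree"]
print(top_tags(torch.tensor([2.0, -1.0, 3.5, 0.2, -0.5]), vocab, k=3))
# -> ['snow', 'ski', 'pole']
```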
= 71.12974012567756 2022-03-17 02:01:14,037.037 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021763863041996956 2022-03-17 02:01:14,037.037 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 02:01:14,038.038 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'red', '[MASK]', '[MASK]', 'a', 'cemetery', 'beside', 'a', 'forest', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 02:01:14,053.053 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sign', 'tree', 'pole', 'stop', 'letter', 'ground', 'trunk', 'leaf', 'branch', '[UNK]', 'red', 'road', 'bolt', 'fence', 'weed', 'sky', 'grass', 'writing', 'arrow', 'wood', 'plant', 'bush', 'forest', 'post', 'next', 'street', 'dirt', 'wire', 'front', 'corner', 'graffiti', 'way', 'base', 'area', 'line', 'window', 'white', 'side', 'car', 'wooded', 'bench', 'name', 'rock', 'box', 'green', 'word', 'path', 'back', 'light', 'roof'] 2022-03-17 02:01:29,991.991 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'red', 'ground', 'stop', 'forest', 'tree', 'letter', 'sign', 'cemetery', 'grass', 'pole', 'leaf', 'trunk'] 2022-03-17 02:03:53,621.621 2829:trainer.py:487 do_train_dict(): eta: 10:29:04 iter: 44500 speed: 280.6 images/sec total_norm: 145.3201 (148.7217) loss: 142.4798 (143.7655) masked_loss: 1.4907 (1.5357) tag_loss: 140.6557 (142.2298) time: 1.4331 (1.8248) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4280 (1.8196) save_time: 8.8805 (16.9902) lr: 0.000033 max mem: 26307 2022-03-17 02:03:53,982.982 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.47058823704719543 2022-03-17 02:03:53,983.983 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 141.45477294921875 2022-03-17 02:03:53,983.983 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.13863661043312 2022-03-17 02:04:16,478.478 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02181190438568592 2022-03-17 02:04:16,478.478 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 02:04:16,479.479 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'little', 'boy', 'in', 'a', 'car', '[MASK]', 'a', 'sandwich', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 02:04:16,494.494 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['window', 'car', 'tree', '[UNK]', 'person', 'mouth', 'eye', 'bread', 'jacket', 'seat', 'face', 'shirt', 'nose', 'windshield', 'head', 'finger', 'sky', 'hand', 'food', 'ear', 'arm', 'chair', 'button', 'table', 'sleeve', 'dog', 'vehicle', 'man', 'thumb', 'building', 'roof', 'hair', 'banana', 'logo', 'sandwich', 'door', 'hat', 'collar', 'road', 'handle', 'bun', 'paper', 'small', 'curtain', 'light', 'bus', 'wall', 'bear', 'label', 'top'] 2022-03-17 02:04:32,481.481 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'hand', 'little', 'face', 'car', 'hair', 'mouth', 'seat', 'boy', 'eye', 'window', 'tree', 'shirt', 'finger', 'nose', 'ear', 'pocket', 'tag', 'button', 'jacket', 'bread', 'sleeve', 'sandwich', 'strap'] 2022-03-17 02:06:56,275.275 2829:trainer.py:487 do_train_dict(): eta: 10:26:17 iter: 44600 speed: 280.3 images/sec total_norm: 147.1895 (150.9362) loss: 143.2369 (143.5238) masked_loss: 1.5019 (1.5042) tag_loss: 142.0414 (142.0195) time: 1.4331 (1.8264) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4281 (1.8213) save_time: 8.8805 (16.9902) lr: 0.000033 max mem: 26307 2022-03-17 02:06:56,638.638 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6470588445663452 2022-03-17 02:06:56,638.638 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 138.19956970214844 2022-03-17 02:06:56,638.638 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
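"Tag Precision." moves by only hundredths of a point between reports (71.12 to 71.14 over several hundred iterations), which suggests a cumulative statistic over the whole run rather than a per-batch value. A sketch of such a running meter, under that assumption:

```python
class RunningPrecision:
    """Cumulative precision in percent: of all tags predicted so far,
    the fraction that appeared in the ground-truth tag sets."""

    def __init__(self):
        self.hits = 0
        self.predicted = 0

    def update(self, pred_tags, gt_tags):
        gt = set(gt_tags)
        self.hits += sum(tag in gt for tag in pred_tags)
        self.predicted += len(pred_tags)

    @property
    def value(self):
        return 100.0 * self.hits / max(self.predicted, 1)

meter = RunningPrecision()
meter.update(["ski", "boot", "snow"], ["snow", "ski", "pole"])
print(round(meter.value, 2))   # 66.67 for this toy update
```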
= 71.14188735757098 2022-03-17 02:07:19,088.088 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02181151881814003 2022-03-17 02:07:19,089.089 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 02:07:19,089.089 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'riding', 'a', 'motorcycle', 'next', '[MASK]', 'a', 'red', 'city', 'bus', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 02:07:19,105.105 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['man', 'road', 'helmet', 'jacket', 'motorcycle', 'tire', 'plate', 'street', 'bus', 'license', 'sidewalk', 'bike', 'window', 'curb', 'line', 'officer', 'person', 'boot', 'police', 'pole', '[UNK]', 'light', 'shoe', 'wheel', 'door', 'sign', 'head', 'arrow', 'stripe', 'vehicle', 'shadow', 'safety', 'glove', 'bicycle', 'woman', 'city', 'handle', 'bag', 'policeman', 'car', 'back', 'wall', 'van', 'traffic', 'tree', 'tail', 'pipe', 'building', 'hair', 'mirror'] 2022-03-17 02:07:35,136.136 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['city', 'man', 'line', 'next', 'road', 'street', 'red', 'light', 'police', 'person', 'officer', 'window', 'shirt', 'bus', 'truck', 'plate', 'wheel', 'hat', 'license', 'cap', 'pole', 'jacket', 'bike', 'cop', 'motorcycle', 'boot', 'helmet', 'shoe', 'sidewalk', 'tire', 'curb'] 2022-03-17 02:09:58,959.959 2829:trainer.py:487 do_train_dict(): eta: 10:23:29 iter: 44700 speed: 280.3 images/sec total_norm: 147.8476 (150.8522) loss: 140.2339 (142.5799) masked_loss: 1.4206 (1.4493) tag_loss: 138.4982 (141.1306) time: 1.4340 (1.8269) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4288 (1.8218) save_time: 8.8805 (16.9902) lr: 0.000033 max mem: 26307 2022-03-17 02:09:59,319.319 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5 2022-03-17 02:09:59,319.319 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 131.57757568359375 2022-03-17 02:09:59,319.319 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.14805282013756 2022-03-17 02:10:21,832.832 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021816806867718697 2022-03-17 02:10:21,832.832 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 02:10:21,833.833 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'three', 'pastry', 'dessert', '##s', '[MASK]', 'two', 'glasses', 'of', 'wine', 'are', 'on', 'a', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 02:10:21,848.848 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['plate', 'table', 'light', 'dessert', 'fork', 'cake', '[UNK]', 'food', 'paper', 'car', 'sauce', 'napkin', 'topping', 'background', 'cream', 'piece', 'crust', 'whipped', 'mushroom', 'spoon', 'cup', 'pie', 'restaurant', 'chocolate', 'bread', 'handle', 'newspaper', 'coffee', 'shadow', 'glass', 'bowl', 'reflection', 'layer', 'window', 'slice', 'person', 'pizza', 'dish', 'meat', 'base', 'white', 'olive', 'knife', 'menu', 'bottle', 'ball', 'stem', 'ice', 'wine', 'delicious'] 2022-03-17 02:10:37,758.758 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'light', 'car', 'person', 'table', 'base', 'glass', 'paper', 'shirt', 'wine', 'plate', 'shadow', 'knife', 'meat', 'bread', 'stem', 'fork', 'dish', 'dessert', 'crust', 'napkin', 'topping', 'pastry'] 2022-03-17 02:13:01,687.687 2829:trainer.py:487 do_train_dict(): eta: 10:20:42 iter: 44800 speed: 280.2 images/sec total_norm: 149.9937 (153.1639) loss: 139.3013 (139.9245) masked_loss: 1.4402 (1.4530) tag_loss: 137.5764 (138.4716) time: 1.4335 (1.8273) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4283 (1.8221) save_time: 8.8805 (16.9902) lr: 0.000033 max mem: 26307 2022-03-17 02:13:02,047.047 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6969696879386902 2022-03-17 02:13:02,047.047 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 123.20221710205078 2022-03-17 02:13:02,047.047 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.16377372996578 03-17 02:13:10.405 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-17 02:13:10.405 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-17 02:13:11.103 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 0}] 2022-03-17 02:13:24,578.578 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021821388974785805 2022-03-17 02:13:24,579.579 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 02:13:24,579.579 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'photograph', '[MASK]', 'a', 'produce', 'stand', 'in', 'a', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 02:13:24,594.594 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['banana', '[UNK]', 'vegetable', 'market', 'shirt', 'person', 'produce', 'pole', 'carrot', 'bag', 'onion', 'table', 'crate', 'stand', 'basket', 'fruit', 'sign', 'potato', 'man', 'woman', 'apple', 'ground', 'hair', 'bunch', 'cabbage', 'mango', 'head', 'skirt', 'pepper', 'stick', 'box', 'ceiling', 'plastic', 'display', 'squash', 'umbrella', 'building', 'hand', 'shelf', 'roof', 'pumpkin', 'leaf', 'bin', 'canopy', 'hat', 'tomato', 'jean', 'window', 'garlic', 'street'] 2022-03-17 02:13:40,481.481 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'ground', 'person', 'table', 'market', 'stand', 'shirt', 'produce', 'roof', 'shadow', 'plastic', 'photograph', 'potato', 'banana', 'vegetable', 'onion', 'carrot', 'pumpkin', 'crate'] 2022-03-17 02:16:04,486.486 2829:trainer.py:487 do_train_dict(): eta: 10:17:55 iter: 44900 speed: 280.1 images/sec total_norm: 147.0449 (149.0564) loss: 136.5453 (140.1855) masked_loss: 1.3919 (1.4591) tag_loss: 135.0207 (138.7264) time: 1.4326 (1.8280) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4273 (1.8229) save_time: 8.8805 (16.9902) lr: 0.000032 max mem: 26307 2022-03-17 02:16:04,848.848 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7142857313156128 2022-03-17 02:16:04,848.848 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 144.12722778320312 2022-03-17 02:16:04,848.848 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
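The monitor() entries report one dict per GPU in the form {'mem_used': ..., 'mem_total': ..., 'gpu_util': ...}. The same structure can be produced with nvidia-smi's query mode; a sketch (aml_server.py logs the plain nvidia-smi command, so its actual parsing may differ):

```python
import subprocess

def gpu_monitor():
    """One dict per GPU, shaped like the monitor() log entries."""
    out = subprocess.check_output([
        "nvidia-smi",
        "--query-gpu=memory.used,memory.total,utilization.gpu",
        "--format=csv,noheader,nounits",
    ]).decode()
    stats = []
    for line in out.strip().splitlines():     # e.g. '29000, 32510, 97'
        used, total, util = (int(x) for x in line.split(", "))
        stats.append({"mem_used": used, "mem_total": total,
                      "gpu_util": util})
    return stats
```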
= 71.16637097676595 2022-03-17 02:16:27,367.367 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021904485300183296 2022-03-17 02:16:27,368.368 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 02:16:27,368.368 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'dimly', 'lit', 'din', '##ning', 'table', 'with', 'a', 'flower', '[MASK]', '##piece', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 02:16:27,383.383 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['table', 'vase', 'flower', 'glass', 'bottle', 'wine', 'woman', 'leaf', 'shadow', 'stem', 'plant', 'label', 'hair', 'wall', 'water', '[UNK]', 'box', 'nose', 'mouth', 'face', 'shirt', 'watch', 'base', 'bowl', 'room', 'hand', 'eye', 'napkin', 'shelf', 'background', 'book', 'light', 'logo', 'person', 'glasses', 'girl', 'picture', 'head', 'lamp', 'bouquet', 'paper', 'fork', 'candle', 'jacket', 'plate', 'next', 'card', 'chair', 'bag', 'refrigerator'] 2022-03-17 02:16:43,187.187 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'room', 'woman', 'hair', 'table', 'wall', 'glass', 'plant', 'box', 'shirt', 'label', 'bottom', 'nose', 'wine', 'shadow', 'lit', 'bottle', 'flower', 'leaf', 'stem', 'shelf', 'vase', 'dimly'] 2022-03-17 02:19:07,469.469 2829:trainer.py:487 do_train_dict(): eta: 10:15:08 iter: 45000 speed: 279.8 images/sec total_norm: 145.5425 (146.9968) loss: 138.9292 (139.9383) masked_loss: 1.4847 (1.4861) tag_loss: 137.5657 (138.4522) time: 1.4327 (1.8298) data: 0.0001 (0.0002) to_device: 0.0052 (0.0050) time_gpu: 1.4275 (1.8246) save_time: 8.8805 (16.9902) lr: 0.000032 max mem: 26307 2022-03-17 02:19:07,471.471 2829:checkpoint.py:222 save(): Saving checkpoint to output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0045000.pt 2022-03-17 02:19:16,620.620 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7272727489471436 2022-03-17 02:19:16,620.620 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 136.8945770263672 2022-03-17 02:19:16,621.621 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
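At iter 45000 the trainer writes snapshot/model_iter_0045000.pt; together with the recurring save_time entries, this points to a fixed checkpoint cadence. A sketch, where the 5000-iteration period and the saved fields are assumptions read off the snapshot name, not taken from checkpoint.py:

```python
import torch

def maybe_save(model, optimizer, iteration, out_dir, period=5000):
    """Write a snapshot every `period` iterations (period is assumed)."""
    if iteration % period != 0:
        return
    path = f"{out_dir}/snapshot/model_iter_{iteration:07d}.pt"
    torch.save({"iteration": iteration,
                "model": model.state_dict(),
                "optimizer": optimizer.state_dict()}, path)
```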
= 71.17092167247425 2022-03-17 02:19:39,232.232 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02189561165869236 2022-03-17 02:19:39,233.233 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 02:19:39,234.234 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'several', 'bundles', '[MASK]', 'fruit', 'hanging', '[MASK]', 'a', 'plant', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 02:19:39,249.249 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'leaf', 'banana', 'stem', 'trunk', 'bunch', 'plant', 'flower', 'branch', 'sky', '[UNK]', 'green', 'moss', 'fruit', 'vine', 'bush', 'rock', 'large', 'building', 'fence', 'bananas', 'forest', 'top', 'ground', 'jungle', 'shadow', 'bark', 'grass', 'stalk', 'tropical', 'ripe', 'group', 'water', 'tail', 'light', 'lush', 'bottom', 'wire', 'big', 'wall', 'small', 'dirt', 'side', 'red', 'front', 'picture', 'road', 'cluster', 'area', 'fern'] 2022-03-17 02:19:55,041.041 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'rock', 'plant', 'tree', 'branch', 'sky', 'hanging', 'fruit', 'flower', 'leaf', 'stem', 'trunk', 'bunch', 'banana'] 2022-03-17 02:22:17,969.969 2829:trainer.py:487 do_train_dict(): eta: 10:12:24 iter: 45100 speed: 268.8 images/sec total_norm: 147.0945 (149.3187) loss: 139.8319 (140.7661) masked_loss: 1.4648 (1.4864) tag_loss: 138.1013 (139.2797) time: 1.4329 (1.9050) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4278 (1.8121) save_time: 8.8805 (16.0781) lr: 0.000032 max mem: 26307 2022-03-17 02:22:18,331.331 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7575757503509521 2022-03-17 02:22:18,331.331 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 117.73262786865234 2022-03-17 02:22:18,331.331 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.18742867697657 2022-03-17 02:22:41,076.076 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02189384214580059 2022-03-17 02:22:41,077.077 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 02:22:41,077.077 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'bathroom', 'with', 'a', '[MASK]', 'toilet', ',', 'white', '[MASK]', 'and', 'white', 'shower', 'curtain', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 02:22:41,093.093 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'bathroom', 'sink', 'mirror', '[UNK]', 'curtain', 'picture', 'toilet', 'handle', 'towel', 'light', 'floor', 'cabinet', 'lid', 'rug', 'frame', 'door', 'bottle', 'reflection', 'shower', 'tank', 'bowl', 'rack', 'knob', 'soap', 'tile', 'shelf', 'holder', 'drawer', 'base', 'vanity', 'white', 'dish', 'seat', 'basket', 'ring', 'ceiling', 'tub', 'switch', 'decoration', 'rod', 'fixture', 'outlet', 'table', 'box', 'plate', 'stand', 'paper', 'pipe', 'leg'] 2022-03-17 02:22:57,088.088 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'white', 'door', 'light', 'table', 'wall', 'stand', 'ring', 'picture', 'frame', 'tank', 'handle', 'mirror', 'bathroom', 'shower', 'switch', 'sink', 'reflection', 'towel', 'curtain', 'toilet', 'lid', 'tub', 'magnet', 'knob', 'rug'] 2022-03-17 02:25:21,112.112 2829:trainer.py:487 do_train_dict(): eta: 10:09:37 iter: 45200 speed: 279.6 images/sec total_norm: 147.8916 (151.8833) loss: 139.5687 (139.2047) masked_loss: 1.4275 (1.4417) tag_loss: 138.2020 (137.7629) time: 1.4343 (1.8314) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4290 (1.8262) save_time: 8.8805 (16.0781) lr: 0.000032 max mem: 26307 2022-03-17 02:25:21,472.472 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7272727489471436 2022-03-17 02:25:21,473.473 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 127.7848129272461 2022-03-17 02:25:21,473.473 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
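Tag mAP hovers around 0.0217 and is re-reported each cycle. One standard way to get such a figure is per-tag average precision over the vocabulary, mean-reduced; a sketch with scikit-learn, noting the pipeline's own implementation may differ:

```python
import numpy as np
from sklearn.metrics import average_precision_score

def tag_map(scores, labels):
    """Mean average precision over the tag vocabulary.
    scores: (N, num_tags) predictions; labels: (N, num_tags) in {0, 1}."""
    aps = [average_precision_score(labels[:, t], scores[:, t])
           for t in range(labels.shape[1])
           if labels[:, t].any()]             # skip tags with no positives
    return float(np.mean(aps))

rng = np.random.default_rng(0)
print(tag_map(rng.random((8, 5)), rng.integers(0, 2, (8, 5))))
```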
= 71.19178291135539 2022-03-17 02:25:44,156.156 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02191152796149254 2022-03-17 02:25:44,157.157 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 02:25:44,157.157 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'bowl', 'full', 'of', 'multi', 'colored', 'pasta', 'and', '[MASK]', '##coll', '##i', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 02:25:44,172.172 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'table', 'bowl', 'food', 'pasta', 'shrimp', 'plate', 'rice', 'chicken', 'handle', 'cup', 'tomato', 'spoon', 'salad', 'meat', 'dish', 'pepper', 'fork', 'vegetable', 'pea', 'knife', 'carrot', 'onion', 'corn', 'stem', 'meal', 'container', 'white', 'flower', 'lid', 'mushroom', 'full', 'red', 'cheese', 'fry', 'colorful', 'top', 'blue', 'glass', 'wooden', 'close', 'olive', 'pizza', 'side', 'sausage', 'can', 'lemon', 'mixed', 'large', 'next'] 2022-03-17 02:26:00,096.096 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'full', 'cup', 'table', 'food', 'bowl', 'multi', 'handle', 'plate', 'rice', 'chicken', 'fork', 'vegetable', 'spoon', 'shrimp', 'pasta'] 2022-03-17 02:28:24,316.316 2829:trainer.py:487 do_train_dict(): eta: 10:06:49 iter: 45300 speed: 279.5 images/sec total_norm: 145.3128 (147.4635) loss: 142.8366 (141.5086) masked_loss: 1.3910 (1.4478) tag_loss: 141.4757 (140.0607) time: 1.4335 (1.8320) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4279 (1.8268) save_time: 8.8805 (16.0781) lr: 0.000032 max mem: 26307 2022-03-17 02:28:24,677.677 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5454545617103577 2022-03-17 02:28:24,677.677 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 177.65623474121094 2022-03-17 02:28:24,677.677 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.1881763861568 2022-03-17 02:28:47,621.621 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021909112110733986 2022-03-17 02:28:47,621.621 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 02:28:47,621.621 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'and', 'green', 'train', 'is', '##nesia', 'into', 'a', 'station', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 02:28:47,638.638 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['window', 'train', 'sky', 'bridge', 'number', '[UNK]', 'line', 'light', 'track', 'platform', 'front', 'windshield', 'door', 'sign', 'beam', 'building', 'car', 'yellow', 'pole', 'wire', 'station', 'fence', 'green', 'bumper', 'writing', 'wall', 'roof', 'puddle', 'sidewalk', 'walkway', 'shadow', 'tree', 'cloud', 'engine', 'handle', 'street', 'stripe', 'road', 'logo', 'gravel', 'plate', 'ground', 'water', 'blue', 'letter', 'railing', 'traffic', 'ladder', 'bush', 'pavement'] 2022-03-17 02:29:03,511.511 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['number', 'line', 'station', 'front', 'light', 'green', 'bridge', 'mountain', 'window', 'train', 'sign', 'sky', 'yellow', 'text', 'bus', 'traffic', 'platform', 'handle', 'plate', 'license', 'pole', 'beam', 'ladder', 'stripe', 'puddle'] 2022-03-17 02:31:27,235.235 2829:trainer.py:487 do_train_dict(): eta: 10:04:02 iter: 45400 speed: 279.9 images/sec total_norm: 146.7626 (149.5498) loss: 142.2525 (143.1962) masked_loss: 1.4364 (1.4794) tag_loss: 140.6587 (141.7168) time: 1.4327 (1.8292) data: 0.0001 (0.0005) to_device: 0.0049 (0.0049) time_gpu: 1.4279 (1.8238) save_time: 8.8805 (16.0781) lr: 0.000032 max mem: 26307 2022-03-17 02:31:27,597.597 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6470588445663452 2022-03-17 02:31:27,598.598 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 139.1650390625 2022-03-17 02:31:27,598.598 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.19887241740803 2022-03-17 02:31:50,536.536 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021944560110569 2022-03-17 02:31:50,537.537 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 02:31:50,537.537 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'bunch', 'of', '[MASK]', 'that', 'have', 'some', '[MASK]', 'in', 'it', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 02:31:50,553.553 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['boat', 'shirt', 'hair', 'man', 'hand', 'container', 'rope', 'food', 'water', 'bowl', 'pole', 'head', '[UNK]', 'basket', 'bin', 'boy', 'fish', 'person', 'arm', 'sauce', 'meat', 'paddle', 'bag', 'bucket', 'handle', 'ear', 'tire', 'tray', 'stripe', 'box', 'carrot', 'jean', 'vegetable', 'knife', 'dish', 'small', 'wall', 'foot', 'cloth', 'plastic', 'mirror', 'lid', 'jacket', 'something', 'other', 'dock', 'pizza', 'woman', 'bottle', 'young'] 2022-03-17 02:32:06,456.456 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'hand', 'water', 'hair', 'food', 'arm', 'shirt', 'fish', 'boat', 'bowl', 'meat', 'pole', 'bin', 'rope', 'brush', 'bunch', 'basket', 'container', 'tire', 'sauce', 'bowls', 'paddle'] 2022-03-17 02:34:30,209.209 2829:trainer.py:487 do_train_dict(): eta: 10:01:14 iter: 45500 speed: 279.8 images/sec total_norm: 146.0719 (149.9861) loss: 139.7415 (139.5959) masked_loss: 1.5358 (1.5258) tag_loss: 137.8792 (138.0701) time: 1.4325 (1.8297) data: 0.0001 (0.0002) to_device: 0.0050 (0.0050) time_gpu: 1.4273 (1.8246) save_time: 8.8805 (16.0781) lr: 0.000031 max mem: 26307 2022-03-17 02:34:30,570.570 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7941176295280457 2022-03-17 02:34:30,570.570 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 165.72402954101562 2022-03-17 02:34:30,570.570 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
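lr steps down by 1e-6 roughly every 600-700 iterations in this excerpt (0.000034 at iter 44100 down to 0.000031 by iter 45500). That slope matches a linear ramp of about 1e-4 per ~65k iterations, in line with the lr_1e-4 tag in the run name. A sketch, with the horizon an inference rather than a logged fact:

```python
def linear_lr(iteration, base_lr=1e-4, total_iters=65000):
    # base_lr from the run name (lr_1e-4); total_iters inferred from the
    # ~1e-6 drop per ~650 iterations seen in this excerpt.
    return base_lr * max(0.0, 1.0 - iteration / total_iters)

print(f"{linear_lr(45500):.6f}")   # 0.000030, one step off the logged value
```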
= 71.2002185269406 2022-03-17 02:34:53,718.718 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021948710083961487 2022-03-17 02:34:53,719.719 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 02:34:53,719.719 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'kitchen', 'has', 'a', 'free', 'standing', 'counter', 'and', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 02:34:53,734.734 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['table', 'wall', 'chair', '[UNK]', 'bottle', 'apple', 'bowl', 'clock', 'window', 'tray', 'orange', 'fruit', 'shelf', 'cup', 'room', 'bucket', 'mat', 'drawer', 'handle', 'container', 'lid', 'can', 'stand', 'paper', 'kettle', 'cabinet', 'mirror', 'curtain', 'towel', 'cloth', 'mug', 'light', 'knife', 'plate', 'basket', 'pen', 'kitchen', 'pot', 'coffee', 'knob', 'microwave', 'maker', 'item', 'bag', 'dining', 'label', 'ceiling', 'pitcher', 'box', 'desk'] 2022-03-17 02:35:09,700.700 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'room', 'book', 'light', 'cup', 'free', 'radio', 'table', 'wall', 'standing', 'chair', 'plant', 'window', 'kitchen', 'label', 'coffee', 'wine', 'orange', 'bowl', 'counter', 'frame', 'clock', 'mirror', 'bottle', 'fruit', 'apple', 'flower', 'pen', 'cloth', 'item', 'pot', 'maker', 'dish', 'shelf', 'container', 'tray', 'marker', 'drawer', 'mat', 'bucket', 'jar', 'vase'] 2022-03-17 02:37:33,631.631 2829:trainer.py:487 do_train_dict(): eta: 9:58:27 iter: 45600 speed: 279.1 images/sec total_norm: 146.8478 (151.2619) loss: 137.7772 (141.0801) masked_loss: 1.4549 (1.4667) tag_loss: 136.3211 (139.6134) time: 1.4336 (1.8342) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4284 (1.8291) save_time: 8.8805 (16.0781) lr: 0.000031 max mem: 26307 2022-03-17 02:37:33,991.991 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361 2022-03-17 02:37:33,991.991 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 140.31912231445312 2022-03-17 02:37:33,991.991 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.20183854551753 2022-03-17 02:37:56,956.956 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021926330402493477 2022-03-17 02:37:56,957.957 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 02:37:56,957.957 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'there', 'is', 'a', 'brown', 'couch', 'and', 'white', 'curtains', 'in', 'this', 'living', 'room', '[MASK]', 'are', '[MASK]', 'stairs', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 02:37:56,972.972 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['window', 'room', 'couch', 'floor', 'wall', 'table', 'door', 'curtain', 'chair', 'sofa', 'pillow', 'living', 'ceiling', 'balcony', 'carpet', 'furniture', 'mirror', 'coffee', 'stair', 'cabinet', 'lamp', 'light', 'circle', 'rug', 'glass', '[UNK]', 'ottoman', 'shade', 'design', 'large', 'rod', 'vent', 'staircase', 'handle', 'cushion', 'book', 'television', 'bowl', 'shelf', 'doorway', 'plant', 'picture', 'railing', 'building', 'switch', 'stand', 'area', 'top', 'step', 'stool'] 2022-03-17 02:38:12,831.831 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['room', 'white', 'door', 'living', 'television', 'floor', 'table', 'wall', 'brown', 'chair', 'plant', 'window', 'coffee', 'cabinet', 'bottle', 'ottoman', 'couch', 'hook', 'pillow', 'sofa', 'staircase', 'curtain', 'balcony', 'tray', 'railing', 'dresser', 'cushion', 'stair'] 2022-03-17 02:40:36,741.741 2829:trainer.py:487 do_train_dict(): eta: 9:55:39 iter: 45700 speed: 279.6 images/sec total_norm: 146.2788 (149.3616) loss: 140.8254 (142.2374) masked_loss: 1.5306 (1.5512) tag_loss: 139.0315 (140.6861) time: 1.4337 (1.8311) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4283 (1.8259) save_time: 8.8805 (16.0781) lr: 0.000031 max mem: 26307 2022-03-17 02:40:37,102.102 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091 2022-03-17 02:40:37,102.102 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 137.29315185546875 2022-03-17 02:40:37,103.103 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.20614439951801 2022-03-17 02:41:00,247.247 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02196500077843666 2022-03-17 02:41:00,247.247 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 02:41:00,247.247 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'sheep', 'crowd', 'the', 'street', 'in', 'a', '[MASK]', 'near', '[MASK]', 'and', 'trees', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 02:41:00,263.263 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['building', 'window', 'ground', 'roof', 'sky', 'rock', 'sheep', 'snow', 'house', 'person', 'door', 'pole', 'mountain', 'old', 'wall', 'picture', 'horse', 'hill', 'tree', 'head', 'cloud', 'hat', 'barn', 'animal', '[UNK]', 'white', 'man', 'road', 'photo', 'cow', 'doorway', 'chimney', 'wheel', 'herd', 'coat', 'sign', 'car', 'front', 'light', 'pipe', 'street', 'fence', 'town', 'hillside', 'black', 'large', 'goat', 'carriage', 'cabin', 'snowy'] 2022-03-17 02:41:16,169.169 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'house', 'town', 'building', 'road', 'street', 'ground', 'rock', 'mountain', 'window', 'tree', 'horse', 'sky', 'picture', 'animal', 'roof', 'snow', 'symbol', 'doorway', 'trunk', 'sheep', 'cow'] 03-17 02:43:11.204 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-17 02:43:11.204 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-17 02:43:12.348 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 97}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 89}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 88}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 95}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 97}] 2022-03-17 02:43:39,927.927 2829:trainer.py:487 do_train_dict(): eta: 9:52:51 iter: 45800 speed: 279.5 images/sec total_norm: 145.6557 (148.5404) loss: 141.4648 (141.0630) masked_loss: 1.4525 (1.4922) tag_loss: 139.9415 (139.5707) time: 1.4319 (1.8319) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4272 (1.8269) save_time: 8.8805 (16.0781) lr: 0.000031 max mem: 26307 2022-03-17 02:43:40,291.291 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5151515007019043 2022-03-17 02:43:40,291.291 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 146.31019592285156 2022-03-17 02:43:40,291.291 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.20996450287065 2022-03-17 02:44:03,368.368 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021973075345158577 2022-03-17 02:44:03,368.368 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 02:44:03,369.369 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'cars', 'are', 'stopped', 'at', 'a', 'traffic', '[MASK]', 'during', 'sunset', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 02:44:03,384.384 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'light', 'pole', 'car', 'road', 'line', 'traffic', 'tree', 'street', 'wire', 'truck', 'bus', 'window', 'sign', '[UNK]', 'cloud', 'mirror', 'intersection', 'windshield', 'building', 'sun', 'ground', 'view', 'power', 'wall', 'tail', 'vehicle', 'lot', 'red', 'wheel', 'person', 'sunset', 'van', 'night', 'picture', 'fence', 'tire', 'back', 'sidewalk', 'side', 'grass', 'door', 'tower', 'fire', 'background', 'city', 'photo', 'busy', 'station', 'bridge'] 2022-03-17 02:44:19,300.300 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['line', 'road', 'street', 'light', 'car', 'window', 'tree', 'sky', 'bus', 'traffic', 'truck', 'mirror', 'pole', 'wire', 'sunset', 'windshield'] 2022-03-17 02:46:43,146.146 2829:trainer.py:487 do_train_dict(): eta: 9:50:03 iter: 45900 speed: 279.4 images/sec total_norm: 148.6446 (151.2845) loss: 141.1665 (140.7816) masked_loss: 1.4903 (1.5185) tag_loss: 139.6264 (139.2631) time: 1.4334 (1.8322) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4286 (1.8271) save_time: 8.8805 (16.0781) lr: 0.000031 max mem: 26307 2022-03-17 02:46:43,509.509 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.625 2022-03-17 02:46:43,509.509 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 127.8098373413086 2022-03-17 02:46:43,510.510 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
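For post-hoc analysis, the do_train_dict() records are regular enough to parse mechanically. A small sketch that pulls the iteration, speed, and global-average loss out of one record:

```python
import re

TRAIN_RE = re.compile(
    r"eta: (?P<eta>\S+) iter: (?P<iter>\d+) speed: (?P<speed>[\d.]+) images/sec"
    r".*?loss: (?P<loss>[\d.]+) \((?P<loss_avg>[\d.]+)\)")

record = ("eta: 10:54:06 iter: 43600 speed: 280.8 images/sec "
          "total_norm: 146.7132 (149.1552) loss: 141.0429 (142.9773)")
m = TRAIN_RE.search(record)
print(m.group("iter"), m.group("speed"), m.group("loss_avg"))
# -> 43600 280.8 142.9773
```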
= 71.22112145631209 2022-03-17 02:47:06,789.789 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02197863534092903 2022-03-17 02:47:06,789.789 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 02:47:06,790.790 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'of', 'chocolate', 'layer', 'cake', 'with', 'american', 'flag', 'planted', 'in', 'it', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 02:47:06,805.805 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['cake', 'flag', 'table', 'spoon', '[UNK]', 'plate', 'handle', 'cloth', 'stick', 'fork', 'reflection', 'star', 'bowl', 'light', 'top', 'flower', 'piece', 'candle', 'bread', 'stripe', 'design', 'stem', 'cream', 'crust', 'glass', 'american', 'paper', 'cup', 'layer', 'water', 'dessert', 'small', 'slice', 'food', 'person', 'leaf', 'shadow', 'next', 'napkin', 'hand', 'white', 'tea', 'sugar', 'object', 'blade', 'close', 'ball', 'ice', 'bottle', 'coffee'] 2022-03-17 02:47:22,691.691 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'american', 'table', 'handle', 'plate', 'flag', 'chocolate', 'reflection', 'cake', 'spoon', 'crust'] 2022-03-17 02:49:46,215.215 2829:trainer.py:487 do_train_dict(): eta: 9:47:15 iter: 46000 speed: 279.7 images/sec total_norm: 146.9878 (148.8675) loss: 142.6540 (141.9819) masked_loss: 1.4465 (1.4693) tag_loss: 141.2993 (140.5126) time: 1.4325 (1.8307) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4272 (1.8255) save_time: 8.8805 (16.0781) lr: 0.000031 max mem: 26307 2022-03-17 02:49:46,576.576 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6388888955116272 2022-03-17 02:49:46,576.576 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 141.797607421875 2022-03-17 02:49:46,576.576 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.22365999532109 2022-03-17 02:50:09,888.888 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02198127843439579 2022-03-17 02:50:09,888.888 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 02:50:09,889.889 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'is', 'open', 'on', 'a', 'desk', '[MASK]', 'dim', 'light', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 02:50:09,904.904 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['lamp', 'table', 'window', 'chair', 'wall', 'box', 'desk', 'shade', 'plate', 'keyboard', 'blind', 'room', 'floor', 'knife', 'cup', 'computer', '[UNK]', 'screen', 'paper', 'glass', 'lid', 'laptop', 'remote', 'monitor', 'light', 'bottle', 'mouse', 'water', 'can', 'napkin', 'speaker', 'pillow', 'book', 'mug', 'cd', 'printer', 'television', 'control', 'bowl', 'bag', 'car', 'guitar', 'coffee', 'cushion', 'handle', 'shirt', 'person', 'shelf', 'stack', 'phone'] 2022-03-17 02:50:25,894.894 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'can', 'room', 'open', 'door', 'light', 'cup', 'floor', 'table', 'wall', 'glass', 'chair', 'paper', 'computer', 'window', 'box', 'screen', 'coffee', 'desk', 'plate', 'knife', 'bottle', 'blind', 'couch', 'remote', 'mouse', 'monitor', 'shade', 'keyboard', 'lamp', 'sofa', 'dim', 'laptop', 'rack', 'mug', 'soda', 'ledge'] 2022-03-17 02:52:49,871.871 2829:trainer.py:487 do_train_dict(): eta: 9:44:28 iter: 46100 speed: 278.8 images/sec total_norm: 147.9836 (150.9202) loss: 139.2249 (141.8203) masked_loss: 1.4920 (1.5302) tag_loss: 137.3647 (140.2902) time: 1.4340 (1.8366) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4290 (1.8315) save_time: 8.8805 (16.0781) lr: 0.000031 max mem: 26307 2022-03-17 02:52:50,233.233 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7941176295280457 2022-03-17 02:52:50,233.233 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 110.102294921875 2022-03-17 02:52:50,234.234 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.23229049913812 2022-03-17 02:53:13,558.558 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022016631439328194 2022-03-17 02:53:13,558.558 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 02:53:13,559.559 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'climbs', 'a', 'ladder', 'up', '[MASK]', 'a', 'si', '##lo', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 02:53:13,574.574 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['man', 'truck', 'beam', 'shirt', '[UNK]', 'pole', 'sky', 'building', 'ramp', 'ladder', 'hat', 'windshield', 'window', 'post', 'head', 'roof', 'tire', 'wheel', 'jean', 'front', 'wall', 'ground', 'structure', 'belt', 'door', 'stair', 'cab', 'snow', 'person', 'license', 'light', 'mirror', 'sign', 'van', 'railing', 'plate', 'hair', 'pillar', 'jacket', 'ceiling', 'vehicle', 'white', 'leg', 'road', 'next', 'bumper', 'camera', 'cap', 'logo', 'grass'] 2022-03-17 02:53:29,416.416 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'house', 'building', 'road', 'front', 'light', 'ground', 'post', 'wall', 'window', 'jean', 'shirt', 'roof', 'truck', 'plate', 'wheel', 'belt', 'ceiling', 'hat', 'license', 'pole', 'beam', 'fence', 'pipe', 'steering', 'ladder', 'ramp', 'pillar', 'railing', 'windshield', 'bumper', 'stair'] 2022-03-17 02:55:53,447.447 2829:trainer.py:487 do_train_dict(): eta: 9:41:40 iter: 46200 speed: 278.9 images/sec total_norm: 147.2830 (150.2882) loss: 141.3240 (141.0717) masked_loss: 1.5054 (1.4939) tag_loss: 139.6562 (139.5778) time: 1.4327 (1.8357) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4274 (1.8305) save_time: 8.8805 (16.0781) lr: 0.000030 max mem: 26307 2022-03-17 02:55:53,812.812 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7058823704719543 2022-03-17 02:55:53,813.813 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 104.05728149414062 2022-03-17 02:55:53,813.813 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.24662524526114 2022-03-17 02:56:17,044.044 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022092510014772415 2022-03-17 02:56:17,044.044 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 02:56:17,044.044 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'rip', 'is', '[MASK]', 'in', 'coasts', 'with', 'batting', 'exposed', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 02:56:17,060.060 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['flower', 'umbrella', 'bed', 'design', 'blanket', 'banana', 'pillow', 'handle', 'yellow', 'vase', '[UNK]', 'top', 'tie', 'table', 'cloth', 'blue', 'colorful', 'white', 'paper', 'window', 'fish', 'light', 'butterfly', 'shadow', 'black', 'floral', 'ball', 'tag', 'towel', 'eye', 'bag', 'scissors', 'purple', 'small', 'leaf', 'wall', 'star', 'dot', 'bear', 'fabric', 'sheet', 'other', 'bunch', 'ear', 'material', 'pair', 'number', 'button', 'circle', 'animal'] 2022-03-17 02:56:32,938.938 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['design', 'bed', 'circle', 'flower', 'fabric', 'batting', 'rip', 'umbrella', 'banana'] 2022-03-17 02:58:56,846.846 2829:trainer.py:487 do_train_dict(): eta: 9:38:52 iter: 46300 speed: 279.2 images/sec total_norm: 146.1909 (149.5816) loss: 140.9943 (142.9112) masked_loss: 1.4123 (1.4693) tag_loss: 139.6194 (141.4419) time: 1.4314 (1.8340) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4263 (1.8289) save_time: 8.8805 (16.0781) lr: 0.000030 max mem: 26307 2022-03-17 02:58:57,207.207 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.47058823704719543 2022-03-17 02:58:57,208.208 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 154.2015380859375 2022-03-17 02:58:57,208.208 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.24297504589475 2022-03-17 02:59:20,480.480 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022096989676356316 2022-03-17 02:59:20,480.480 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 02:59:20,480.480 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'lit', 'toast', '##er', 'but', '[MASK]', 'toast', '[MASK]', '[MASK]', 'seated', 'next', 'to', 'a', 'microwave', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 02:59:20,496.496 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'window', '[UNK]', 'microwave', 'oven', 'rack', 'knob', 'towel', 'door', 'chair', 'person', 'reflection', 'handle', 'kitchen', 'button', 'light', 'dial', 'cord', 'panel', 'table', 'glass', 'tile', 'outlet', 'plate', 'food', 'cabinet', 'bag', 'top', 'display', 'room', 'cloth', 'leg', 'counter', 'container', 'shelf', 'mirror', 'floor', 'control', 'sink', 'paper', 'bowl', 'cup', 'pot', 'box', 'tray', 'bottle', 'black', 'stove', 'hair', 'lid'] 2022-03-17 02:59:36,486.486 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'next', 'door', 'light', 'person', 'wall', 'chair', 'window', 'metal', 'kitchen', 'empty', 'bag', 'lit', 'plastic', 'button', 'reflection', 'cord', 'dial', 'tile', 'rack', 'knob', 'oven', 'microwave'] 2022-03-17 03:02:00,495.495 2829:trainer.py:487 do_train_dict(): eta: 9:36:04 iter: 46400 speed: 278.8 images/sec total_norm: 149.7223 (152.0181) loss: 144.1093 (145.2236) masked_loss: 1.4588 (1.4851) tag_loss: 143.1072 (143.7384) time: 1.4341 (1.8365) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4288 (1.8313) save_time: 8.8805 (16.0781) lr: 0.000030 max mem: 26307 2022-03-17 03:02:00,855.855 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5405405163764954 2022-03-17 03:02:00,856.856 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 115.45228576660156 2022-03-17 03:02:00,856.856 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.25549110494634 2022-03-17 03:02:24,091.091 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02210330218076706 2022-03-17 03:02:24,091.091 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 03:02:24,092.092 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'girl', 'is', 'holding', '[MASK]', 'game', '[MASK]', 'above', 'her', 'head', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 03:02:24,107.107 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'lamp', 'hand', 'sweater', 'hair', 'jean', 'woman', '[UNK]', 'face', 'shade', 'bottle', 'table', 'chair', 'door', 'head', 'picture', 'shirt', 'arm', 'controller', 'room', 'remote', 'cup', 'couch', 'man', 'ceiling', 'mouth', 'nose', 'switch', 'glasses', 'light', 'eye', 'floor', 'ear', 'book', 'cabinet', 'pillow', 'game', 'living', 'girl', 'beard', 'window', 'plate', 'toy', 'belt', 'person', 'phone', 'box', 'shelf', 'glass', 'frame'] 2022-03-17 03:02:40,014.014 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'hand', 'game', 'face', 'room', 'door', 'woman', 'cup', 'heart', 'living', 'hair', 'girl', 'person', 'table', 'wall', 'character', 'chair', 'jean', 'shirt', 'picture', 'cabinet', 'bottle', 'couch', 'remote', 'switch', 'glasses', 'shade', 'beard', 'lamp', 'ribbon', 'controller', 'container', 'sweater', 'strap'] 2022-03-17 03:05:03,765.765 2829:trainer.py:487 do_train_dict(): eta: 9:33:16 iter: 46500 speed: 279.4 images/sec total_norm: 145.6473 (146.9723) loss: 140.5860 (142.6067) masked_loss: 1.4246 (1.4212) tag_loss: 139.1850 (141.1855) time: 1.4327 (1.8326) data: 0.0001 (0.0005) to_device: 0.0051 (0.0048) time_gpu: 1.4274 (1.8273) save_time: 8.8805 (16.0781) lr: 0.000030 max mem: 26307 2022-03-17 03:05:04,126.126 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5 2022-03-17 03:05:04,126.126 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 163.01873779296875 2022-03-17 03:05:04,126.126 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.25642693810197 2022-03-17 03:05:27,285.285 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022087277844548225 2022-03-17 03:05:27,285.285 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 03:05:27,286.286 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'woman', 'looking', 'up', 'at', 'a', 'street', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 03:05:27,301.301 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sign', 'building', 'flower', 'wall', 'pole', 'tree', 'window', 'ground', 'head', '[UNK]', 'bush', 'hair', 'letter', 'plant', 'shirt', 'woman', 'post', 'face', 'handle', 'sky', 'hat', 'chair', 'pot', 'house', 'girl', 'toy', 'person', 'arm', 'statue', 'leg', 'vase', 'scarf', 'hand', 'shadow', 'shoe', 'sidewalk', 'light', 'stripe', 'mouth', 'umbrella', 'leaf', 'jacket', 'door', 'doll', 'top', 'decoration', 'short', 'man', 'fence', 'balloon'] 2022-03-17 03:05:43,213.213 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'hand', 'face', 'building', 'street', 'woman', 'ground', 'hair', 'wall', 'arm', 'base', 'paper', 'plant', 'ball', 'letter', 'sign', 'jean', 'handle', 'dragon', 'coat', 'bush', 'pole', 'flower', 'jacket', 'bow', 'drum', 'glasses', 'pot', 'ribbon', 'shoe', 'decoration', 'container', 'sidewalk', 'stripe', 'vase', 'scarf'] 2022-03-17 03:08:07,434.434 2829:trainer.py:487 do_train_dict(): eta: 9:30:28 iter: 46600 speed: 278.8 images/sec total_norm: 147.1479 (149.2539) loss: 140.7175 (140.9012) masked_loss: 1.3701 (1.4033) tag_loss: 139.2201 (139.4979) time: 1.4328 (1.8367) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4278 (1.8316) save_time: 8.8805 (16.0781) lr: 0.000030 max mem: 26307 2022-03-17 03:08:07,794.794 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6060606241226196 2022-03-17 03:08:07,795.795 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 147.6370849609375 2022-03-17 03:08:07,795.795 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.26492570145993 2022-03-17 03:08:31,569.569 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022109992802143097 2022-03-17 03:08:31,569.569 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 03:08:31,570.570 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'hot', 'dogs', '[MASK]', 'in', 'chili', 'and', 'sour', 'k', '##ra', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 03:08:31,585.585 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['dog', 'hot', 'paper', 'table', 'bun', 'mustard', '[UNK]', 'napkin', 'onion', 'food', 'cheese', 'foil', 'tray', 'topping', 'light', 'corn', 'plate', 'chili', 'container', 'sauce', 'sandwich', 'line', 'handle', 'bag', 'drink', 'floor', 'top', 'olive', 'wall', 'next', 'red', 'fork', 'white', 'plastic', 'straw', 'pepper', 'end', 'tin', 'glass', 'tomato', 'bread', 'meat', 'chip', 'rice', 'bottom', 'candy', 'close', 'pizza', 'letter', 'can'] 2022-03-17 03:08:47,513.513 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'line', 'table', 'hot', 'dog', 'handle', 'cheese', 'towel', 'tray', 'lid', 'sauce', 'bean', 'sour', 'chili', 'napkin', 'onion', 'bun', 'mustard'] 2022-03-17 03:11:11,100.100 2829:trainer.py:487 do_train_dict(): eta: 9:27:40 iter: 46700 speed: 278.8 images/sec total_norm: 146.7118 (151.3979) loss: 141.5708 (141.8499) masked_loss: 1.4505 (1.4997) tag_loss: 139.6504 (140.3502) time: 1.4330 (1.8367) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4280 (1.8316) save_time: 8.8805 (16.0781) lr: 0.000030 max mem: 26307 2022-03-17 03:11:11,460.460 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6363636255264282 2022-03-17 03:11:11,461.461 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 122.97488403320312 2022-03-17 03:11:11,461.461 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.2754523040902 2022-03-17 03:11:34,876.876 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022123076021671295 2022-03-17 03:11:34,876.876 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 03:11:34,876.876 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'old', 'truck', 'sitting', 'in', '[MASK]', '[MASK]', 'house', 'with', 'trees', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 03:11:34,892.892 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'truck', 'grill', 'windshield', '[UNK]', 'snow', 'plate', 'tire', 'bumper', 'ground', 'hood', 'license', 'sky', 'mirror', 'road', 'window', 'door', 'wheel', 'front', 'bush', 'building', 'snowy', 'light', 'logo', 'house', 'white', 'blue', 'shadow', 'roof', 'puddle', 'wood', 'car', 'handle', 'street', 'lot', 'number', 'forest', 'step', 'rim', 'fence', 'top', 'next', 'pine', 'old', 'parking', 'side', 'sign', 'trailer', 'pole', 'small'] 2022-03-17 03:11:50,809.809 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'house', 'road', 'front', 'ground', 'window', 'tree', 'sky', 'snow', 'truck', 'plate', 'shadow', 'mirror', 'license', 'hood', 'tire', 'grill', 'windshield', 'bumper'] 03-17 03:13:12.449 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-17 03:13:12.449 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-17 03:13:13.585 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-17 03:14:14,738.738 2829:trainer.py:487 do_train_dict(): eta: 9:24:51 iter: 46800 speed: 278.8 images/sec total_norm: 144.6523 (146.5960) loss: 140.8032 (142.8660) masked_loss: 1.4183 (1.4734) tag_loss: 139.4661 (141.3926) time: 1.4327 (1.8364) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4276 (1.8311) save_time: 8.8805 (16.0781) lr: 0.000030 max mem: 26307 2022-03-17 03:14:15,103.103 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6176470518112183 2022-03-17 03:14:15,104.104 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 169.4796905517578 2022-03-17 03:14:15,104.104 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
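The monitor() entries above report one dict per GPU, with mem_used/mem_total in MiB and gpu_util in percent. Below is a minimal sketch of how such a list could be produced from nvidia-smi's query interface; the actual aml_server.py implementation is not shown in this log, so the helper is an assumption, not the real code.

```python
# Hypothetical reconstruction of the per-GPU stats list seen in monitor();
# aml_server.py's real implementation is not part of this log.
import subprocess

def gpu_stats():
    # With these flags nvidia-smi emits one "used, total, util" CSV row per GPU.
    out = subprocess.check_output(
        ["nvidia-smi",
         "--query-gpu=memory.used,memory.total,utilization.gpu",
         "--format=csv,noheader,nounits"],
        text=True,
    )
    stats = []
    for row in out.strip().splitlines():
        used, total, util = (int(v.strip()) for v in row.split(","))
        stats.append({"mem_used": used, "mem_total": total, "gpu_util": util})
    return stats
```

On the 8x V100-SXM2 node in this log, the helper would return eight dicts, matching the list that monitor() writes roughly every 30 minutes.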
= 71.27002097091187 2022-03-17 03:14:38,810.810 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022122416645288467 2022-03-17 03:14:38,810.810 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 03:14:38,811.811 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'and', 'white', 'picture', 'of', 'chefs', 'cooking', 'in', 'a', 'kitchen', '.', 'sharpened', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 03:14:38,826.826 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['man', 'shirt', 'kitchen', 'head', '[UNK]', 'ceiling', 'bowl', 'light', 'table', 'wall', 'person', 'chef', 'food', 'watch', 'plate', 'shelf', 'hair', 'apron', 'pot', 'bottle', 'hand', 'ear', 'container', 'cup', 'pan', 'spoon', 'uniform', 'vent', 'handle', 'restaurant', 'tray', 'hat', 'woman', 'lid', 'door', 'group', 'arm', 'cutting', 'napkin', 'glove', 'floor', 'logo', 'hood', 'towel', 'dish', 'window', 'bag', 'photo', 'white', 'suit'] 2022-03-17 03:14:54,741.741 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'black', 'white', 'light', 'hair', 'person', 'table', 'wall', 'food', 'watch', 'shirt', 'kitchen', 'picture', 'ear', 'bowl', 'plate', 'bottle', 'ceiling', 'cap', 'sink', 'bread', 'chef', 'shelf', 'container', 'lid', 'refrigerator', 'apron', 'kettle'] 2022-03-17 03:17:18,255.255 2829:trainer.py:487 do_train_dict(): eta: 9:22:03 iter: 46900 speed: 279.0 images/sec total_norm: 148.8805 (151.5371) loss: 139.0197 (139.6487) masked_loss: 1.4521 (1.4627) tag_loss: 137.9297 (138.1860) time: 1.4324 (1.8352) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4271 (1.8301) save_time: 8.8805 (16.0781) lr: 0.000029 max mem: 26307 2022-03-17 03:17:18,617.617 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.529411792755127 2022-03-17 03:17:18,617.617 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 131.20040893554688 2022-03-17 03:17:18,617.617 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.28087819687863 2022-03-17 03:17:42,336.336 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022115055471658707 2022-03-17 03:17:42,337.337 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 03:17:42,337.337 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'some', 'players', '[MASK]', 'action', 'in', 'a', '[MASK]', 'game', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 03:17:42,352.352 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'helmet', 'grass', 'dirt', 'catcher', 'shoe', 'field', 'glove', 'shirt', 'fence', 'leg', 'bat', 'plate', 'mask', 'man', 'uniform', 'player', 'belt', 'ground', 'home', 'batter', 'jersey', 'umpire', 'head', 'hand', 'baseball', 'person', 'number', 'camera', 'line', 'face', 'game', 'arm', 'shin', 'hat', 'chair', 'sign', 'ball', 'banner', 'cooler', 'guard', 'stand', 'pad', 'railing', 'ready', 'towel', 'guards', 'woman', 'name', 'band'] 2022-03-17 03:17:58,232.232 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'home', 'hand', 'number', 'game', 'face', 'band', 'player', 'field', 'ground', 'arm', 'action', 'baseball', 'sign', 'shirt', 'jersey', 'leg', 'camera', 'plate', 'grass', 'belt', 'uniform', 'dirt', 'bat', 'mask', 'fence', 'banner', 'helmet', 'shoe', 'catcher', 'glove', 'umpire', 'batter'] 2022-03-17 03:20:22,027.027 2829:trainer.py:487 do_train_dict(): eta: 9:19:15 iter: 47000 speed: 278.6 images/sec total_norm: 148.0130 (150.1703) loss: 141.1111 (140.8762) masked_loss: 1.4546 (1.4666) tag_loss: 139.3108 (139.4095) time: 1.4342 (1.8377) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4287 (1.8325) save_time: 8.8805 (16.0781) lr: 0.000029 max mem: 26307 2022-03-17 03:20:22,389.389 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6666666865348816 2022-03-17 03:20:22,389.389 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 168.59767150878906 2022-03-17 03:20:22,390.390 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.27458203066686 2022-03-17 03:20:46,083.083 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022139830514788628 2022-03-17 03:20:46,083.083 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 03:20:46,084.084 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'train', 'cars', '[MASK]', 'on', '[MASK]', 'tracks', 'next', 'to', 'a', 'platform', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 03:20:46,099.099 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['window', 'sky', 'train', 'tree', 'platform', 'door', 'letter', 'number', 'roof', 'windshield', 'boat', 'handle', 'building', 'railing', 'track', 'bumper', 'cloud', 'rail', 'car', 'logo', 'vent', 'light', '[UNK]', 'line', 'pole', 'step', 'blue', 'front', 'bench', 'stair', 'seat', 'sign', 'person', 'mirror', 'bus', 'next', 'station', 'stripe', 'chair', 'gravel', 'top', 'fence', 'white', 'post', 'flag', 'wire', 'lot', 'background', 'man', 'side'] 2022-03-17 03:21:02,050.050 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'number', 'building', 'door', 'car', 'window', 'train', 'tree', 'letter', 'machine', 'sky', 'platform', 'roof', 'handle', 'cloud', 'railing', 'bumper'] 2022-03-17 03:23:25,859.859 2829:trainer.py:487 do_train_dict(): eta: 9:16:26 iter: 47100 speed: 278.5 images/sec total_norm: 148.5389 (149.7176) loss: 141.6600 (142.7763) masked_loss: 1.4648 (1.4956) tag_loss: 140.3538 (141.2807) time: 1.4338 (1.8383) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4286 (1.8331) save_time: 8.8805 (16.0781) lr: 0.000029 max mem: 26307 2022-03-17 03:23:26,221.221 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361 2022-03-17 03:23:26,221.221 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 161.4774169921875 2022-03-17 03:23:26,221.221 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.28039753639092 2022-03-17 03:23:50,048.048 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02217692881822586 2022-03-17 03:23:50,048.048 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 03:23:50,049.049 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'party', '[MASK]', 'four', 'standing', 'at', '[MASK]', 'tennis', 'net', 'one', 'man', '[MASK]', 'wearing', 'a', 'costume', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 03:23:50,064.064 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['man', 'shirt', 'hair', 'fence', 'tree', '[UNK]', 'tennis', 'court', 'pole', 'hand', 'shoe', 'ground', 'head', 'line', 'leg', 'short', 'sock', 'arm', 'grass', 'jean', 'ball', 'person', 'handle', 'beard', 'net', 'bench', 'bush', 'face', 'sky', 'game', 'wall', 'shadow', 'hat', 'street', 'sidewalk', 'cap', 'guy', 'car', 'sign', 'sunglasses', 'glasses', 'couple', 'young', 'road', 'backpack', 'logo', 'ear', 'watch', 'boy', 'roof'] 2022-03-17 03:24:06,080.080 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'line', 'party', 'woman', 'court', 'short', 'hair', 'person', 'standing', 'foot', 'tree', 'ball', 'jean', 'shirt', 'leg', 'tennis', 'coat', 'grass', 'belt', 'net', 'jacket', 'fence', 'collar', 'costume'] 2022-03-17 03:26:29,735.735 2829:trainer.py:487 do_train_dict(): eta: 9:13:38 iter: 47200 speed: 278.5 images/sec total_norm: 147.4293 (148.8538) loss: 141.5915 (142.9183) masked_loss: 1.4124 (1.4379) tag_loss: 140.2686 (141.4804) time: 1.4344 (1.8388) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4292 (1.8336) save_time: 8.8805 (16.0781) lr: 0.000029 max mem: 26307 2022-03-17 03:26:30,096.096 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361 2022-03-17 03:26:30,097.097 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 168.59213256835938 2022-03-17 03:26:30,097.097 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.27672050266386 2022-03-17 03:26:53,820.820 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02217678911983967 2022-03-17 03:26:53,820.820 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 03:26:53,820.820 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'room', 'with', 'a', 'tv', 'and', 'no', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 03:26:53,836.836 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['floor', 'wall', 'light', 'ceiling', 'room', 'box', 'door', '[UNK]', 'kitchen', 'cabinet', 'bag', 'chair', 'shelf', 'television', 'outlet', 'switch', 'table', 'living', 'refrigerator', 'carpet', 'couch', 'column', 'book', 'sofa', 'lid', 'shirt', 'fire', 'vent', 'suitcase', 'trash', 'towel', 'pillow', 'handle', 'pillar', 'window', 'microwave', 'hallway', 'man', 'picture', 'toy', 'leg', 'drawer', 'hood', 'pot', 'stool', 'flower', 'tile', 'person', 'can', 'cardboard'] 2022-03-17 03:27:09,757.757 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'room', 'book', 'door', 'light', 'television', 'ground', 'tv', 'floor', 'wall', 'plant', 'gun', 'box', 'kitchen', 'screen', 'bag', 'handle', 'ceiling', 'flower', 'hallway', 'wire', 'furniture', 'pot', 'closet', 'carpet', 'towel', 'shoe', 'shelf', 'cord', 'lid', 'mat', 'refrigerator', 'vase', 'rug'] 2022-03-17 03:29:33,763.763 2829:trainer.py:487 do_train_dict(): eta: 9:10:50 iter: 47300 speed: 278.2 images/sec total_norm: 146.5983 (148.3750) loss: 133.8196 (135.6978) masked_loss: 1.5059 (1.5127) tag_loss: 132.5114 (134.1851) time: 1.4333 (1.8402) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4282 (1.8350) save_time: 8.8805 (16.0781) lr: 0.000029 max mem: 26307 2022-03-17 03:29:34,124.124 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7878788113594055 2022-03-17 03:29:34,124.124 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 131.19818115234375 2022-03-17 03:29:34,124.124 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.2912524340022 2022-03-17 03:29:57,680.680 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022175565361976624 2022-03-17 03:29:57,680.680 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 03:29:57,681.681 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'guys', 'jumping', 'for', 'a', 'fr', '##is', '##bee', 'while', 'others', 'watch', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 03:29:57,696.696 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'man', 'short', 'grass', 'shoe', 'tree', 'sock', 'hand', '[UNK]', 'arm', 'sunglasses', 'ground', 'hat', 'fence', 'boy', 'belt', 'field', 'cap', 'head', 'leg', 'person', 'shadow', 'hair', 'group', 'air', 'park', 'cone', 'watch', 'glasses', 'number', 'car', 'vest', 'design', 'game', 'other', 'face', 'bag', 'stripe', 'young', 'woman', 'pole', 'knee', 'couple', 'sign', 'trunk', 'grassy', 'pad', 'back', 'sidewalk', 'glove'] 2022-03-17 03:30:13,627.627 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'hand', 'air', 'short', 'field', 'ground', 'person', 'arm', 'boy', 'tree', 'shirt', 'leg', 'shadow', 'grass', 'belt', 'hat', 'fence', 'shoe', 'sunglasses', 'sock'] 2022-03-17 03:32:37,620.620 2829:trainer.py:487 do_train_dict(): eta: 9:08:01 iter: 47400 speed: 278.5 images/sec total_norm: 147.6856 (153.0433) loss: 138.0872 (138.9916) masked_loss: 1.4173 (1.4225) tag_loss: 136.7552 (137.5691) time: 1.4341 (1.8385) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4288 (1.8334) save_time: 8.8805 (16.0781) lr: 0.000029 max mem: 26307 2022-03-17 03:32:37,981.981 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5714285969734192 2022-03-17 03:32:37,981.981 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 136.08389282226562 2022-03-17 03:32:37,981.981 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.29869244224147 2022-03-17 03:33:01,925.925 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022162389010190964 2022-03-17 03:33:01,925.925 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 03:33:01,925.925 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'baseball', 'batter', ',', 'catcher', ',', 'and', 'umpire', 'get', 'ready', 'for', 'the', 'pitch', '[MASK]', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 03:33:01,941.941 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['uniform', 'belt', 'man', '[UNK]', 'shirt', 'player', 'jersey', 'head', 'bat', 'line', 'helmet', 'field', 'glove', 'grass', 'baseball', 'catcher', 'shoe', 'hat', 'hand', 'strap', 'leg', 'back', 'mask', 'arm', 'umpire', 'number', 'cap', 'batter', 'plate', 'dirt', 'logo', 'home', 'net', 'patch', 'name', 'hair', 'shin', 'game', 'guard', 'ball', 'pitch', 'shoulder', 'base', 'ground', 'stripe', 'sock', 'sleeve', 'pole', 'ready', 'guards'] 2022-03-17 03:33:17,887.887 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'back', 'get', 'head', 'man', 'name', 'hand', 'line', 'player', 'field', 'hair', 'arm', 'ready', 'baseball', 'shirt', 'jersey', 'leg', 'grass', 'belt', 'hat', 'cap', 'uniform', 'pitch', 'bat', 'mask', 'patch', 'helmet', 'shoe', 'catcher', 'glove', 'strap', 'bracelet', 'umpire', 'batter'] 2022-03-17 03:35:41,614.614 2829:trainer.py:487 do_train_dict(): eta: 9:05:13 iter: 47500 speed: 278.3 images/sec total_norm: 146.3037 (150.0780) loss: 139.3985 (141.8122) masked_loss: 1.4417 (1.4579) tag_loss: 137.9291 (140.3543) time: 1.4319 (1.8400) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4267 (1.8349) save_time: 8.8805 (16.0781) lr: 0.000028 max mem: 26307 2022-03-17 03:35:41,974.974 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6764705777168274 2022-03-17 03:35:41,975.975 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 138.34457397460938 2022-03-17 03:35:41,975.975 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.30472671284394 2022-03-17 03:36:05,709.709 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02216068096458912 2022-03-17 03:36:05,710.710 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 03:36:05,710.710 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'is', 'swinging', 'a', 'tennis', 'rack', '##et', 'on', 'a', 'court', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 03:36:05,725.725 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['leg', '[UNK]', 'hand', 'tennis', 'court', 'shoe', 'hair', 'arm', 'short', 'head', 'shirt', 'handle', 'player', 'woman', 'man', 'wall', 'nose', 'ground', 'ball', 'face', 'mouth', 'sock', 'ear', 'shadow', 'logo', 'ponytail', 'line', 'skirt', 'person', 'dress', 'stripe', 'letter', 'foot', 'chair', 'band', 'outfit', 'sign', 'string', 'top', 'eye', 'floor', 'wrist', 'banner', 'hat', 'watch', 'cap', 'necklace', 'girl', 'cooler', 'female'] 2022-03-17 03:36:21,612.612 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'line', 'player', 'woman', 'court', 'ground', 'hair', 'mouth', 'wall', 'arm', 'stand', 'chair', 'ball', 'letter', 'sky', 'shirt', 'platform', 'leg', 'dress', 'tennis', 'hat', 'cap', 'logo', 'shoe', 'outfit', 'sunglasses', 'bracelet', 'sock'] 2022-03-17 03:38:45,842.842 2829:trainer.py:487 do_train_dict(): eta: 9:02:24 iter: 47600 speed: 277.9 images/sec total_norm: 148.3192 (152.9970) loss: 138.2993 (140.6993) masked_loss: 1.5422 (1.5453) tag_loss: 136.1622 (139.1540) time: 1.4325 (1.8423) data: 0.0001 (0.0005) to_device: 0.0051 (0.0050) time_gpu: 1.4273 (1.8367) save_time: 8.8805 (16.0781) lr: 0.000028 max mem: 26307 2022-03-17 03:38:46,202.202 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4117647111415863 2022-03-17 03:38:46,202.202 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 140.47686767578125 2022-03-17 03:38:46,202.202 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.30535854283619 2022-03-17 03:39:10,270.270 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022198941558599472 2022-03-17 03:39:10,271.271 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 03:39:10,271.271 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', 'old', 'fire', 'hydra', '##nt', 'with', 'chip', '##ped', '[MASK]', 'has', 'rust', 'alpine', 'it', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 03:39:10,286.286 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'fire', 'cap', 'shirt', 'bolt', 'tree', 'man', 'top', 'wheel', 'ground', 'person', 'trunk', 'arm', 'face', 'paint', 'hat', 'hair', 'building', 'old', 'woman', 'fence', 'tag', 'branch', 'background', 'sky', 'dirt', 'hand', 'cart', 'wood', 'head', 'wall', 'knob', 'leaf', 'eye', 'pole', 'yellow', 'base', 'jacket', 'sidewalk', 'blue', 'plant', 'jean', 'side', 'grass', 'shoe', 'window', 'number', 'next', 'wooden', 'writing'] 2022-03-17 03:39:26,209.209 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'old', 'face', 'body', 'top', 'fire', 'ground', 'person', 'eye', 'tree', 'shirt', 'wheel', 'hat', 'cap', 'paint', 'bolt', 'knob'] 2022-03-17 03:41:49,875.875 2829:trainer.py:487 do_train_dict(): eta: 8:59:36 iter: 47700 speed: 278.2 images/sec total_norm: 145.2389 (147.5624) loss: 138.6443 (139.9719) masked_loss: 1.4503 (1.4740) tag_loss: 137.1254 (138.4979) time: 1.4330 (1.8404) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4277 (1.8352) save_time: 8.8805 (16.0781) lr: 0.000028 max mem: 26307 2022-03-17 03:41:50,237.237 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6764705777168274 2022-03-17 03:41:50,237.237 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 152.892333984375 2022-03-17 03:41:50,237.237 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.31035914879962 2022-03-17 03:42:14,059.059 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02220848947763443 2022-03-17 03:42:14,059.059 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 03:42:14,060.060 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'cat', 'on', 'a', 'table', 'chewing', 'the', 'edge', 'of', 'a', 'book', '[MASK]', 'is', 'lying', 'beside', 'it', '[MASK]', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 03:42:14,075.075 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['cat', 'head', 'ear', 'table', 'nose', 'book', 'face', 'eye', 'man', 'writing', 'shirt', '[UNK]', 'hair', 'paw', 'leg', 'floor', 'mouth', 'cord', 'kitten', 'letter', 'word', 'chair', 'carpet', 'magazine', 'orange', 'desk', 'cover', 'marker', 'hand', 'pen', 'girl', 'next', 'paper', 'photo', 'wooden', 'tail', 'wire', 'glasses', 'finger', 'line', 'image', 'dot', 'nail', 'white', 'handle', 'button', 'top', 'shadow', 'picture', 'straw'] 2022-03-17 03:42:29,945.945 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'face', 'book', 'mouth', 'floor', 'word', 'table', 'writing', 'eye', 'edge', 'letter', 'shirt', 'leg', 'nose', 'ear', 'desk', 'object', 'cat', 'clothing', 'carpet', 'dot'] 03-17 03:43:13.611 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-17 03:43:13.611 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-17 03:43:14.850 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 90}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}] 2022-03-17 03:44:54,011.011 2829:trainer.py:487 do_train_dict(): eta: 8:56:47 iter: 47800 speed: 278.1 images/sec total_norm: 146.9491 (151.2876) loss: 137.7076 (140.0496) masked_loss: 1.4431 (1.4877) tag_loss: 136.3362 (138.5619) time: 1.4329 (1.8413) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4276 (1.8362) save_time: 8.8805 (16.0781) lr: 0.000028 max mem: 26307 2022-03-17 03:44:54,372.372 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6764705777168274 2022-03-17 03:44:54,372.372 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 141.2831573486328 2022-03-17 03:44:54,372.372 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.3158869494476 2022-03-17 03:45:18,415.415 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02220868691802025 2022-03-17 03:45:18,415.415 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 03:45:18,415.415 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'person', 'holding', 'a', 'broken', 'cell', 'phone', 'while', 'looking', 'at', 'the', '[MASK]', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 03:45:18,431.431 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hand', 'phone', 'finger', 'person', 'cell', 'screen', '[UNK]', 'nail', 'thumb', 'button', 'face', 'table', 'logo', 'smart', 'woman', 'camera', 'picture', 'light', 'man', 'reflection', 'palm', 'device', 'eye', 'shirt', 'glasses', 'shadow', 'bowl', 'background', 'cord', 'wall', 'speaker', 'ring', 'lip', 'iphone', 'key', 'ear', 'head', 'hair', 'rim', 'glass', 'small', 'electronic', 'close', 'floor', 'cloth', 'screw', 'front', 'handle', 'room', 'base'] 2022-03-17 03:45:34,323.323 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['hand', 'person', 'phone', 'paper', 'cell', 'broken', 'screen', 'finger', 'camera', 'thumb'] 2022-03-17 03:47:58,202.202 2829:trainer.py:487 do_train_dict(): eta: 8:53:58 iter: 47900 speed: 278.0 images/sec total_norm: 147.0159 (151.2230) loss: 140.2948 (140.2809) masked_loss: 1.3591 (1.4616) tag_loss: 138.4697 (138.8192) time: 1.4328 (1.8419) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4277 (1.8367) save_time: 8.8805 (16.0781) lr: 0.000028 max mem: 26307 2022-03-17 03:47:58,562.562 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5 2022-03-17 03:47:58,562.562 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 158.79212951660156 2022-03-17 03:47:58,563.563 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.32025716304778 2022-03-17 03:48:22,818.818 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022234836593270302 2022-03-17 03:48:22,818.818 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 03:48:22,819.819 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'black', 'cat', 'sits', 'on', 'a', 'rug', 'with', 'a', 'red', 'cord', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 03:48:22,834.834 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['floor', 'wall', 'rug', 'head', 'ear', 'cat', 'paw', 'tile', 'mat', 'door', 'dog', 'eye', 'nose', 'leg', 'bag', 'face', '[UNK]', 'tail', 'line', 'carpet', 'black', 'collar', 'shoe', 'leash', 'handle', 'room', 'tag', 'stripe', 'cabinet', 'kitchen', 'refrigerator', 'clothes', 'ground', 'foot', 'knob', 'cord', 'outlet', 'person', 'container', 'box', 'towel', 'strap', 'bed', 'next', 'can', 'bathroom', 'light', 'jean', 'small', 'chair'] 2022-03-17 03:48:38,854.854 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'room', 'black', 'door', 'red', 'floor', 'wall', 'eye', 'nose', 'ear', 'cat', 'tag', 'bow', 'tape', 'ribbon', 'cord', 'mat', 'tile', 'rug', 'paw', 'leash'] 2022-03-17 03:51:02,751.751 2829:trainer.py:487 do_train_dict(): eta: 8:51:10 iter: 48000 speed: 277.4 images/sec total_norm: 147.9644 (150.7946) loss: 141.1832 (139.7102) masked_loss: 1.4822 (1.4622) tag_loss: 139.7764 (138.2479) time: 1.4342 (1.8455) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4290 (1.8402) save_time: 8.8805 (16.0781) lr: 0.000028 max mem: 26307 2022-03-17 03:51:03,112.112 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5142857432365417 2022-03-17 03:51:03,112.112 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 128.7156219482422 2022-03-17 03:51:03,113.113 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.324152440886 2022-03-17 03:51:27,361.361 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022246699780225754 2022-03-17 03:51:27,362.362 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 03:51:27,362.362 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'young', 'boy', 'in', '[MASK]', '[MASK]', 'standing', 'in', 'the', 'snow', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 03:51:27,378.378 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['snow', '[UNK]', 'ground', 'jacket', 'glove', 'hood', 'pole', 'ski', 'stripe', 'boot', 'boy', 'tree', 'hat', 'face', 'head', 'leg', 'child', 'mouth', 'sunglasses', 'hand', 'coat', 'nose', 'strap', 'shoe', 'young', 'foot', 'person', 'girl', 'tag', 'little', 'snowy', 'kid', 'ear', 'blue', 'small', 'arm', 'woman', 'cuff', 'stick', 'sock', 'bush', 'country', 'sky', 'vest', 'skier', 'track', 'branch', 'gear', 'skiing', 'slope'] 2022-03-17 03:51:43,396.396 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'hand', 'face', 'young', 'ground', 'blue', 'mouth', 'boy', 'tree', 'leg', 'nose', 'snow', 'hat', 'tag', 'pole', 'jacket', 'hood', 'ski', 'boot', 'glove', 'strap', 'stripe'] 2022-03-17 03:54:06,778.778 2829:trainer.py:487 do_train_dict(): eta: 8:48:21 iter: 48100 speed: 278.2 images/sec total_norm: 147.5848 (149.1245) loss: 143.4307 (144.7334) masked_loss: 1.4259 (1.4422) tag_loss: 141.9932 (143.2913) time: 1.4318 (1.8403) data: 0.0001 (0.0002) to_device: 0.0050 (0.0050) time_gpu: 1.4266 (1.8351) save_time: 8.8805 (16.0781) lr: 0.000028 max mem: 26307 2022-03-17 03:54:07,139.139 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5454545617103577 2022-03-17 03:54:07,139.139 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 132.91464233398438 2022-03-17 03:54:07,139.139 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.3353139077974 2022-03-17 03:54:31,475.475 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022274794057011604 2022-03-17 03:54:31,476.476 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 03:54:31,476.476 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'laptop', 'computer', 'set', 'up', 'on', 'a', 'makeshift', 'cardboard', '[MASK]', 'on', 'a', 'desk', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 03:54:31,491.491 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['laptop', 'table', 'keyboard', 'box', 'desk', 'screen', 'paper', 'computer', '[UNK]', 'pen', 'wall', 'book', 'key', 'button', 'cord', 'chair', 'speaker', 'mouse', 'pencil', 'wire', 'bag', 'cup', 'writing', 'window', 'handle', 'logo', 'stand', 'notebook', 'ear', 'shelf', 'pad', 'container', 'top', 'monitor', 'plug', 'phone', 'marker', 'envelope', 'head', 'tape', 'cable', 'hand', 'light', 'coffee', 'card', 'picture', 'lid', 'remote', 'door', 'arm'] 2022-03-17 03:54:47,468.468 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['cup', 'table', 'wall', 'key', 'stand', 'paper', 'computer', 'box', 'wood', 'screen', 'card', 'desk', 'map', 'handle', 'button', 'wire', 'monitor', 'keyboard', 'holder', 'envelope', 'cord', 'marker', 'laptop', 'pencil', 'makeshift', 'cardboard', 'scissors'] 2022-03-17 03:57:11,188.188 2829:trainer.py:487 do_train_dict(): eta: 8:45:32 iter: 48200 speed: 277.6 images/sec total_norm: 148.8483 (152.8994) loss: 143.9297 (144.0957) masked_loss: 1.5178 (1.5530) tag_loss: 141.9299 (142.5427) time: 1.4320 (1.8442) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4270 (1.8390) save_time: 8.8805 (16.0781) lr: 0.000027 max mem: 26307 2022-03-17 03:57:11,548.548 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5135135054588318 2022-03-17 03:57:11,549.549 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 144.86636352539062 2022-03-17 03:57:11,549.549 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
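The speed field is consistent with the batch size encoded in the output path at the iter-50000 checkpoint (batch-size_512) divided by the running-average step time: for the iter-48200 record, 512 / 1.8442 s ≈ 277.6 images/sec, exactly as logged. A one-line sanity check (the formula is inferred from this arithmetic, not taken from trainer.py):

    # Throughput check for the iter-48200 record.
    batch_size = 512          # from the checkpoint path: ..._batch-size_512_...
    avg_step_time = 1.8442    # running-average "time" at iter 48200
    print(f"{batch_size / avg_step_time:.1f} images/sec")  # -> 277.6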
= 71.34213658840267 2022-03-17 03:57:35,708.708 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022399011999368668 2022-03-17 03:57:35,709.709 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 03:57:35,709.709 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'is', 'an', 'elephant', 'statue', 'that', 'is', 'on', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 03:57:35,724.724 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['grass', 'elephant', 'trunk', 'tree', 'foot', 'ear', 'head', 'shadow', 'eye', 'refrigerator', 'ground', 'door', 'fence', 'sign', '[UNK]', 'window', 'leg', 'truck', 'car', 'building', 'booth', 'park', 'sky', 'trailer', 'path', 'letter', 'tire', 'statue', 'person', 'road', 'sidewalk', 'hand', 'box', 'van', 'base', 'walkway', 'shirt', 'machine', 'boy', 'roof', 'shed', 'large', 'chain', 'man', 'gate', 'bush', 'front', 'house', 'top', 'street'] 2022-03-17 03:57:51,623.623 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'road', 'park', 'ground', 'eye', 'foot', 'window', 'tree', 'sky', 'path', 'leg', 'ear', 'truck', 'shadow', 'grass', 'trunk', 'fence', 'booth', 'trailer', 'elephant', 'refrigerator'] 2022-03-17 04:00:15,797.797 2829:trainer.py:487 do_train_dict(): eta: 8:42:43 iter: 48300 speed: 277.3 images/sec total_norm: 146.8591 (151.2959) loss: 141.2348 (142.8129) masked_loss: 1.4665 (1.4911) tag_loss: 139.5633 (141.3218) time: 1.4332 (1.8460) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4281 (1.8410) save_time: 8.8805 (16.0781) lr: 0.000027 max mem: 26307 2022-03-17 04:00:16,159.159 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6857143044471741 2022-03-17 04:00:16,159.159 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 135.21192932128906 2022-03-17 04:00:16,159.159 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.342319496407 2022-03-17 04:00:40,559.559 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02240808866918087 2022-03-17 04:00:40,560.560 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 04:00:40,560.560 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', '[MASK]', 'a', 'skate', '##board', 'doing', 'a', 'trick', 'on', 'a', 'cement', 'wall', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 04:00:40,576.576 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'wall', 'shoe', 'hand', 'painting', 'shirt', 'man', 'leg', 'arm', 'jean', 'hair', 'head', 'wheel', 'map', 'tree', 'board', 'building', 'boy', 'hat', 'sidewalk', 'window', 'face', 'jacket', 'person', 'art', 'glasses', 'picture', 'ramp', 'sign', 'cap', 'branch', 'coat', 'bench', 'short', 'pole', 'ear', 'artwork', 'ground', 'door', 'block', 'watch', 'young', 'sweater', 'woman', 'paint', 'foot', 'step', 'trick', 'sock', 'light'] 2022-03-17 04:00:56,505.505 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'line', 'hair', 'wall', 'arm', 'boy', 'painting', 'leg', 'map', 'wheel', 'hat', 'cap', 'jacket', 'trick', 'reflection', 'sleeve', 'shoe', 'cement'] 2022-03-17 04:03:20,532.532 2829:trainer.py:487 do_train_dict(): eta: 8:39:54 iter: 48400 speed: 277.2 images/sec total_norm: 147.0013 (149.5911) loss: 141.6716 (143.8367) masked_loss: 1.4934 (1.5380) tag_loss: 140.4748 (142.2988) time: 1.4344 (1.8473) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4292 (1.8421) save_time: 8.8805 (16.0781) lr: 0.000027 max mem: 26307 2022-03-17 04:03:20,893.893 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7058823704719543 2022-03-17 04:03:20,893.893 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 127.52845001220703 2022-03-17 04:03:20,894.894 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
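The eta field shrinks by roughly 2:49 per 100 iterations at ~1.85 s/iter, i.e. it behaves like the average step time multiplied by the remaining iteration count. A sketch of that computation under the usual formulation; the run's true max_iter is not visible in this excerpt, so the value below is only a placeholder:

    import datetime

    def format_eta(avg_step_time, cur_iter, max_iter):
        # H:MM:SS string in the same shape as the "eta:" field (assumed formula).
        eta_seconds = avg_step_time * (max_iter - cur_iter)
        return str(datetime.timedelta(seconds=int(eta_seconds)))

    # max_iter=65300 is a placeholder chosen to land near the logged
    # iter-48400 value of "eta: 8:39:54".
    print(format_eta(1.8473, 48400, 65300))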
= 71.35437128912542 2022-03-17 04:03:45,201.201 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022411055862903595 2022-03-17 04:03:45,202.202 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 04:03:45,203.203 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'there', 'is', 'dunes', 'click', 'tower', 'on', 'the', 'front', '[MASK]', 'the', 'building', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 04:03:45,218.218 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['building', 'sky', 'tree', 'window', 'clock', 'tower', '[UNK]', 'wall', 'church', 'dome', 'roof', 'hand', 'top', 'sign', 'spire', 'cross', 'cloud', 'arch', 'tall', 'large', 'light', 'street', 'view', 'pole', 'weather', 'statue', 'yellow', 'background', 'vane', 'old', 'front', 'white', 'wire', 'fence', 'person', 'middle', 'door', 'flag', 'archway', 'circle', 'image', 'city', 'flower', 'leaf', 'ornate', 'side', 'lamp', 'banner', 'bell', 'line'] 2022-03-17 04:04:01,204.204 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'hand', 'building', 'front', 'wall', 'window', 'tree', 'tower', 'sky', 'clock', 'cloud', 'arch', 'dome', 'click'] 2022-03-17 04:06:25,064.064 2829:trainer.py:487 do_train_dict(): eta: 8:37:05 iter: 48500 speed: 277.5 images/sec total_norm: 148.7495 (151.7752) loss: 138.5834 (139.9065) masked_loss: 1.4741 (1.4953) tag_loss: 136.8444 (138.4111) time: 1.4329 (1.8453) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4278 (1.8401) save_time: 8.8805 (16.0781) lr: 0.000027 max mem: 26307 2022-03-17 04:06:25,426.426 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663 2022-03-17 04:06:25,426.426 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 153.03562927246094 2022-03-17 04:06:25,426.426 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.35484857912417 2022-03-17 04:06:49,925.925 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022365126758813858 2022-03-17 04:06:49,926.926 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 04:06:49,926.926 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'elephants', 'in', 'tall', 'dry', 'grass', 'next', 'to', '[MASK]', 'pistols', 'of', 'water', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 04:06:49,942.942 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['grass', 'elephant', 'water', 'tree', 'trunk', 'herd', 'field', 'river', 'animal', '[UNK]', 'ear', 'sky', 'bush', 'tail', 'rock', 'head', 'bank', 'plant', 'group', 'body', 'large', 'bird', 'palm', 'green', 'grassy', 'leg', 'horn', 'leaf', 'zebra', 'branch', 'ripple', 'wild', 'wood', 'cow', 'plain', 'shore', 'dirt', 'watering', 'tall', 'cloud', 'buffalo', 'horse', 'stream', 'hole', 'land', 'lake', 'couple', 'reflection', 'hill', 'area'] 2022-03-17 04:07:05,864.864 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'water', 'body', 'field', 'tree', 'sky', 'tall', 'dry', 'bird', 'palm', 'grass', 'bush', 'plain', 'trunk', 'elephant', 'herd'] 2022-03-17 04:09:29,724.724 2829:trainer.py:487 do_train_dict(): eta: 8:34:16 iter: 48600 speed: 277.3 images/sec total_norm: 148.2089 (150.7668) loss: 140.1087 (140.9313) masked_loss: 1.4378 (1.4482) tag_loss: 138.0247 (139.4831) time: 1.4348 (1.8467) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4296 (1.8415) save_time: 8.8805 (16.0781) lr: 0.000027 max mem: 26307 2022-03-17 04:09:30,088.088 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6216216087341309 2022-03-17 04:09:30,088.088 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 146.89089965820312 2022-03-17 04:09:30,088.088 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.35116856788463 2022-03-17 04:09:54,593.593 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022373618558049202 2022-03-17 04:09:54,594.594 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 04:09:54,594.594 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'black', '[MASK]', '279', 'sitting', 'by', 'a', 'fa', '##uce', '##t', 'of', '[MASK]', 'water', 'in', 'a', 'tub', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 04:09:54,609.609 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['cat', 'head', 'ear', 'eye', 'wall', '[UNK]', 'bathroom', 'toilet', 'sink', 'floor', 'black', 'tile', 'nose', 'lid', 'tub', 'water', 'handle', 'drain', 'leg', 'face', 'shower', 'mirror', 'knob', 'paw', 'camera', 'rug', 'bottle', 'animal', 'holder', 'white', 'door', 'tank', 'collar', 'towel', 'paper', 'soap', 'tag', 'cabinet', 'ledge', 'can', 'pipe', 'reflection', 'shadow', 'next', 'mat', 'brush', 'curtain', 'bar', 'bowl', 'person'] 2022-03-17 04:10:10,551.551 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'water', 'black', 'wall', 'eye', 'ear', 'cat', 'handle', 'bathroom', 'sink', 'drain', 'tub'] 2022-03-17 04:12:34,264.264 2829:trainer.py:487 do_train_dict(): eta: 8:31:27 iter: 48700 speed: 277.4 images/sec total_norm: 147.6346 (150.0370) loss: 139.4770 (141.0584) masked_loss: 1.3989 (1.4448) tag_loss: 138.1284 (139.6136) time: 1.4320 (1.8454) data: 0.0001 (0.0005) to_device: 0.0051 (0.0050) time_gpu: 1.4266 (1.8399) save_time: 8.8805 (16.0781) lr: 0.000027 max mem: 26307 2022-03-17 04:12:34,625.625 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6470588445663452 2022-03-17 04:12:34,626.626 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 133.9613037109375 2022-03-17 04:12:34,626.626 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.3578565394292 2022-03-17 04:12:59,166.166 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022435788065195084 2022-03-17 04:12:59,166.166 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 04:12:59,167.167 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', '##raf', '##fe', 'in', 'a', '[MASK]', 'with', 'a', 'person', 'near', 'trees', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 04:12:59,182.182 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'head', 'bush', 'ear', '[UNK]', 'wood', 'forest', 'grass', 'eye', 'neck', 'branch', 'trunk', 'horn', 'nose', 'face', 'ground', 'hair', 'spot', 'man', 'tail', 'leg', 'mane', 'field', 'shirt', 'camera', 'dirt', 'animal', 'mouth', 'brush', 'leaf', 'sky', 'area', 'plant', 'next', 'green', 'group', 'hat', 'log', 'couple', 'hand', 'lush', 'jungle', 'small', 'wooded', 'tall', 'path', 'standing', 'person', 'fence', 'front'] 03-17 04:13:14.951 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-17 04:13:14.951 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 2022-03-17 04:13:15,081.081 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'field', 'ground', 'hair', 'person', 'arm', 'forest', 'eye', 'neck', 'tree', 'wood', 'shirt', 'leg', 'ear', 'grass', 'belt', 'bush', 'horn'] 03-17 04:13:15.772 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 13}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 12}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 4}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 13}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 12}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 12}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 12}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 37}] 2022-03-17 04:15:38,819.819 2829:trainer.py:487 do_train_dict(): eta: 8:28:38 iter: 48800 speed: 277.4 images/sec total_norm: 145.3937 (151.0171) loss: 137.4663 (139.8789) masked_loss: 1.3957 (1.4348) tag_loss: 136.2184 (138.4440) time: 1.4329 (1.8455) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4276 (1.8403) save_time: 8.8805 (16.0781) lr: 0.000027 max mem: 26307 2022-03-17 04:15:39,180.180 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6363636255264282 2022-03-17 04:15:39,180.180 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 112.019775390625 2022-03-17 04:15:39,180.180 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
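The interleaved aml_server.py monitor() records report per-GPU memory and utilization dicts (here ~29,000 of 32,510 MiB used, with utilization momentarily low between steps). One plausible way to build such records from nvidia-smi's CSV query mode; the real aml_server.py implementation is not shown in this log, and only the nvidia-smi flags below are standard:

    import subprocess

    def gpu_monitor():
        # Query per-GPU memory and utilization in machine-readable form.
        out = subprocess.check_output([
            "nvidia-smi",
            "--query-gpu=memory.used,memory.total,utilization.gpu",
            "--format=csv,noheader,nounits",
        ]).decode()
        stats = []
        for line in out.strip().splitlines():
            used, total, util = (int(x) for x in line.split(", "))
            stats.append({"mem_used": used, "mem_total": total, "gpu_util": util})
        return stats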
= 71.36997084822391 2022-03-17 04:16:03,580.580 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022514278069138527 2022-03-17 04:16:03,581.581 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 04:16:03,581.581 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'tall', 'os', '##tric', '##h', '[MASK]', '[MASK]', 'a', 'lush', 'green', 'field', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 04:16:03,596.596 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'grass', '[UNK]', 'neck', 'bush', 'bird', 'trunk', 'branch', 'field', 'head', 'leg', 'feather', 'flower', 'tail', 'black', 'ground', 'wing', 'plant', 'forest', 'large', 'sky', 'beak', 'grassy', 'wild', 'log', 'background', 'area', 'tall', 'rock', 'body', 'group', 'wood', 'standing', 'pine', 'next', 'top', 'green', 'palm', 'white', 'couple', 'wooded', 'dirt', 'hill', 'other', 'walking', 'brush', 'bear', 'front', 'date', 'middle'] 2022-03-17 04:16:19,487.487 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'field', 'green', 'neck', 'tree', 'branch', 'leg', 'tall', 'bird', 'grass', 'tail', 'bush', 'flower', 'trunk', 'feather', 'lush'] 2022-03-17 04:18:43,361.361 2829:trainer.py:487 do_train_dict(): eta: 8:25:49 iter: 48900 speed: 277.4 images/sec total_norm: 150.1504 (153.1115) loss: 139.2478 (140.6943) masked_loss: 1.3477 (1.3736) tag_loss: 138.0876 (139.3207) time: 1.4322 (1.8454) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4270 (1.8402) save_time: 8.8805 (16.0781) lr: 0.000026 max mem: 26307 2022-03-17 04:18:43,723.723 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6764705777168274 2022-03-17 04:18:43,723.723 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 144.8594970703125 2022-03-17 04:18:43,723.723 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.37973635148029 2022-03-17 04:19:08,361.361 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022561753168702126 2022-03-17 04:19:08,361.361 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 04:19:08,361.361 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'and', 'woman', 'walking', 'down', 'a', '[MASK]', '[MASK]', 'street', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 04:19:08,377.377 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['jacket', '[UNK]', 'umbrella', 'sidewalk', 'person', 'ground', 'shoe', 'head', 'man', 'rain', 'reflection', 'hood', 'coat', 'leg', 'hand', 'vest', 'arm', 'hair', 'bag', 'line', 'face', 'building', 'woman', 'sign', 'nose', 'light', 'street', 'purse', 'handle', 'boy', 'jean', 'wall', 'scarf', 'suitcase', 'tree', 'pole', 'boot', 'car', 'logo', 'hat', 'stripe', 'strap', 'road', 'zipper', 'wheel', 'leaf', 'mouth', 'window', 'floor', 'child'] 2022-03-17 04:19:24,433.433 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'street', 'woman', 'ground', 'hair', 'person', 'jean', 'shirt', 'bag', 'rain', 'handle', 'jacket', 'leaf', 'purse', 'reflection', 'shoe', 'sidewalk', 'umbrella', 'soaked', 'vest', 'stripe', 'scarf'] 2022-03-17 04:21:47,816.816 2829:trainer.py:487 do_train_dict(): eta: 8:23:00 iter: 49000 speed: 277.6 images/sec total_norm: 149.1608 (151.3031) loss: 144.4656 (144.3309) masked_loss: 1.4613 (1.4955) tag_loss: 143.1575 (142.8353) time: 1.4313 (1.8445) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4258 (1.8393) save_time: 8.8805 (16.0781) lr: 0.000026 max mem: 26307 2022-03-17 04:21:48,179.179 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6764705777168274 2022-03-17 04:21:48,179.179 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 137.80441284179688 2022-03-17 04:21:48,179.179 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.38384796111501 2022-03-17 04:22:12,755.755 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02256636507809162 2022-03-17 04:22:12,755.755 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 04:22:12,756.756 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'road', 'crew', 'of', 'a', 'large', 'city', 'street', 'with', '[MASK]', '[MASK]', 'above', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 04:22:12,771.771 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['man', 'tree', 'vest', 'building', '[UNK]', 'line', 'jacket', 'sky', 'shoe', 'window', 'person', 'curb', 'ground', 'road', 'sidewalk', 'street', 'tire', 'fire', 'truck', 'head', 'sign', 'cart', 'car', 'worker', 'shirt', 'jean', 'hat', 'coat', 'safety', 'dirt', 'door', 'plant', 'house', 'cone', 'light', 'hose', 'pole', 'fence', 'city', 'hair', 'leaf', 'wall', 'wheel', 'construction', 'cap', 'hand', 'yellow', 'vehicle', 'stripe', 'ladder'] 2022-03-17 04:22:28,751.751 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'city', 'man', 'line', 'building', 'large', 'road', 'street', 'car', 'ground', 'person', 'sun', 'window', 'tree', 'crew', 'sign', 'sky', 'jean', 'truck', 'jacket', 'toy', 'shoe', 'tire', 'cone', 'curb', 'suv', 'vest'] 2022-03-17 04:24:52,660.660 2829:trainer.py:487 do_train_dict(): eta: 8:20:10 iter: 49100 speed: 277.0 images/sec total_norm: 147.2410 (149.9541) loss: 139.6843 (139.8121) masked_loss: 1.4377 (1.4726) tag_loss: 137.8044 (138.3395) time: 1.4323 (1.8485) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4270 (1.8433) save_time: 8.8805 (16.0781) lr: 0.000026 max mem: 26307 2022-03-17 04:24:53,020.020 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.529411792755127 2022-03-17 04:24:53,020.020 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 132.60842895507812 2022-03-17 04:24:53,020.020 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.39298798010601 2022-03-17 04:25:17,735.735 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02259107679128647 2022-03-17 04:25:17,735.735 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 04:25:17,735.735 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'woman', 'riding', 'on', 'the', '[MASK]', 'of', 'an', 'elephant', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 04:25:17,751.751 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['building', 'window', 'elephant', 'leg', 'trunk', 'shirt', 'hair', 'ear', 'head', 'eye', 'short', 'sign', 'tree', 'sock', '[UNK]', 'foot', 'man', 'plant', 'door', 'wall', 'palm', 'shoe', 'person', 'woman', 'ground', 'fence', 'doorway', 'leaf', 'mouth', 'girl', 'archway', 'statue', 'jean', 'animal', 'hand', 'chair', 'arm', 'boot', 'tail', 'bush', 'sidewalk', 'belt', 'face', 'logo', 'poster', 'branch', 'pole', 'arch', 'bag', 'railing'] 2022-03-17 04:25:33,692.692 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'back', 'head', 'building', 'door', 'woman', 'short', 'hair', 'girl', 'mouth', 'table', 'eye', 'chair', 'plant', 'foot', 'window', 'tree', 'sign', 'shirt', 'platform', 'leg', 'ear', 'palm', 'tail', 'statue', 'logo', 'trunk', 'boot', 'elephant', 'shoe', 'poster', 'advertisement', 'patio', 'archway', 'sock'] 2022-03-17 04:27:57,500.500 2829:trainer.py:487 do_train_dict(): eta: 8:17:21 iter: 49200 speed: 277.0 images/sec total_norm: 146.8042 (150.1229) loss: 137.9583 (140.2049) masked_loss: 1.3763 (1.4622) tag_loss: 136.1676 (138.7428) time: 1.4323 (1.8484) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4272 (1.8433) save_time: 8.8805 (16.0781) lr: 0.000026 max mem: 26307 2022-03-17 04:27:57,861.861 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7272727489471436 2022-03-17 04:27:57,861.861 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 140.09893798828125 2022-03-17 04:27:57,862.862 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
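Tag Precision (~71.4) climbs very slowly across this excerpt while Tag mAP hovers near 0.0226; both track multi-label tag prediction over the tag vocabulary. A generic macro-mAP computation for such a setup, assuming scikit-learn and toy indicator/score matrices; the pipeline's own evaluation code is not in this log:

    import numpy as np
    from sklearn.metrics import average_precision_score

    # Rows = samples, columns = tag vocabulary entries (illustrative only).
    y_true = np.array([[1, 0, 1, 1],        # gold tag indicators
                       [0, 1, 1, 0]])
    y_score = np.array([[0.9, 0.2, 0.7, 0.6],  # per-tag scores/logits
                        [0.3, 0.8, 0.6, 0.2]])
    mAP = average_precision_score(y_true, y_score, average="macro")
    print(mAP)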
= 71.39954369256753 2022-03-17 04:28:22,737.737 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022620538249611855 2022-03-17 04:28:22,737.737 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 04:28:22,737.737 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'very', 'nicely', 'dressed', 'man', 'standing', 'by', '[MASK]', 'door', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 04:28:22,753.753 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['floor', 'jacket', 'shirt', '[UNK]', 'tie', 'shoe', 'suit', 'wall', 'door', 'belt', 'man', 'face', 'hair', 'tag', 'leg', 'collar', 'room', 'arm', 'neck', 'head', 'name', 'hand', 'clothes', 'mouth', 'foot', 'nose', 'glasses', 'knot', 'ear', 'coat', 'eye', 'window', 'ceiling', 'book', 'beard', 'buckle', 'wheel', 'bag', 'outlet', 'table', 'cord', 'picture', 'sock', 'chair', 'paper', 'frame', 'boot', 'hat', 'curtain', 'front'] 2022-03-17 04:28:38,638.638 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'name', 'hand', 'face', 'room', 'door', 'hair', 'floor', 'wall', 'arm', 'eye', 'paper', 'neck', 'foot', 'window', 'sign', 'shirt', 'leg', 'clothes', 'nose', 'ear', 'suit', 'pocket', 'tie', 'belt', 'blind', 'tag', 'jacket', 'collar', 'boot', 'beard', 'shoe', 'poster', 'knob'] 2022-03-17 04:31:02,206.206 2829:trainer.py:487 do_train_dict(): eta: 8:14:32 iter: 49300 speed: 277.2 images/sec total_norm: 147.5067 (149.0720) loss: 137.8561 (137.0620) masked_loss: 1.3613 (1.3977) tag_loss: 135.8996 (135.6643) time: 1.4309 (1.8471) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4256 (1.8419) save_time: 8.8805 (16.0781) lr: 0.000026 max mem: 26307 2022-03-17 04:31:02,566.566 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7647058963775635 2022-03-17 04:31:02,567.567 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 168.91876220703125 2022-03-17 04:31:02,567.567 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
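The caption acc values are consistent with simple ratios over supervised token positions (e.g. the 0.7647... at iter 49300 is exactly 13/17), i.e. the fraction of masked caption tokens predicted correctly. A minimal reconstruction, assuming PyTorch logits/labels where unsupervised positions carry an ignore index; the index value and function name are assumptions, as the pipeline's own definition is not visible here:

    import torch

    def masked_token_accuracy(logits, labels, ignore_index=-1):
        # Fraction of supervised (masked) positions predicted correctly.
        mask = labels != ignore_index
        pred = logits.argmax(dim=-1)
        return (pred[mask] == labels[mask]).float().mean().item()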
= 71.3919389161021 2022-03-17 04:31:27,455.455 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02261926420032978 2022-03-17 04:31:27,456.456 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 04:31:27,456.456 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'pair', '##হ', '[MASK]', 'playing', 'in', 'pool', 'at', 'zoo', 'environment', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 04:31:27,471.471 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['bear', 'water', 'ear', 'nose', 'head', 'eye', 'face', 'fur', 'mouth', 'leg', 'snout', 'brown', 'large', 'rock', 'back', 'tree', 'tongue', '[UNK]', 'hair', 'animal', 'grass', 'sky', 'foot', 'black', 'reflection', 'neck', 'dog', 'paw', 'river', 'wave', 'ripple', 'pool', 'teeth', 'furry', 'big', 'log', 'snow', 'background', 'mountain', 'pole', 'wall', 'light', 'polar', 'swimming', 'horn', 'plant', 'small', 'fish', 'long', 'next'] 2022-03-17 04:31:43,374.374 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'water', 'mouth', 'chest', 'eye', 'neck', 'dog', 'environment', 'teeth', 'animal', 'leg', 'tongue', 'nose', 'ear', 'bear', 'pool', 'zoo'] 2022-03-17 04:34:07,220.220 2829:trainer.py:487 do_train_dict(): eta: 8:11:42 iter: 49400 speed: 276.7 images/sec total_norm: 150.4810 (151.2351) loss: 141.0749 (140.9587) masked_loss: 1.3648 (1.3966) tag_loss: 139.6023 (139.5621) time: 1.4334 (1.8501) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4283 (1.8450) save_time: 8.8805 (16.0781) lr: 0.000026 max mem: 26307 2022-03-17 04:34:07,583.583 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6363636255264282 2022-03-17 04:34:07,583.583 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 170.29539489746094 2022-03-17 04:34:07,583.583 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.38903579711913 2022-03-17 04:34:32,542.542 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02261706255376339 2022-03-17 04:34:32,543.543 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 04:34:32,543.543 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'golden', 'retrieve', '##r', 'and', 'a', 'pit', 'bull', 'sitting', '[MASK]', 'the', 'back', 'of', '[unused465]', 'pickup', 'truck', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 04:34:32,558.558 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['car', 'tree', 'light', 'dog', 'building', 'sky', 'ear', 'street', 'sign', 'mirror', '[UNK]', 'pole', 'collar', 'nose', 'road', 'eye', 'sidewalk', 'background', 'window', 'head', 'leg', 'tire', 'tail', 'motorcycle', 'windshield', 'traffic', 'seat', 'truck', 'license', 'plate', 'line', 'paw', 'curb', 'wheel', 'grill', 'bike', 'vehicle', 'back', 'city', 'suv', 'trunk', 'ground', 'wall', 'door', 'bumper', 'man', 'person', 'harness', 'next', 'handle'] 2022-03-17 04:34:48,426.426 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'can', 'back', 'head', 'line', 'building', 'door', 'road', 'street', 'light', 'car', 'fire', 'post', 'window', 'tree', 'golden', 'sign', 'sky', 'dog', 'traffic', 'nose', 'ear', 'truck', 'plate', 'mirror', 'license', 'pole', 'flower', 'pit', 'bull', 'trunk', 'collar', 'lamp', 'trash', 'sidewalk', 'pickup', 'suv', 'grill', 'windshield'] 2022-03-17 04:37:12,111.111 2829:trainer.py:487 do_train_dict(): eta: 8:08:53 iter: 49500 speed: 276.9 images/sec total_norm: 147.0919 (148.2388) loss: 143.1318 (143.2327) masked_loss: 1.4506 (1.5007) tag_loss: 141.5413 (141.7319) time: 1.4313 (1.8489) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4263 (1.8438) save_time: 8.8805 (16.0781) lr: 0.000025 max mem: 26307 2022-03-17 04:37:12,475.475 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7297297120094299 2022-03-17 04:37:12,475.475 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 137.5308837890625 2022-03-17 04:37:12,475.475 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.39715541562727 2022-03-17 04:37:37,496.496 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02263442426919937 2022-03-17 04:37:37,497.497 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 04:37:37,497.497 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'an', '[MASK]', 'casts', 'a', 'shadow', 'on', 'four', 'chairs', 'on', 'a', 'beach', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 04:37:37,514.514 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['beach', 'chair', 'sand', 'umbrella', 'water', 'person', 'sky', 'ocean', 'wave', 'shadow', 'pole', 'table', 'cloud', 'lounge', 'palm', '[UNK]', 'mountain', 'shore', 'boat', 'towel', 'cushion', 'sandy', 'tree', 'rock', 'lawn', 'horizon', 'bird', 'dog', 'bench', 'footprint', 'roof', 'patio', 'top', 'man', 'hill', 'hut', 'view', 'resort', 'couple', 'many', 'sun', 'building', 'background', 'large', 'canopy', 'straw', 'sunny', 'empty', 'ground', 'wall'] 2022-03-17 04:37:53,433.433 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['water', 'person', 'table', 'chair', 'beach', 'sky', 'ocean', 'wave', 'bird', 'shadow', 'sand', 'cloud', 'pole', 'lounge', 'umbrella'] 2022-03-17 04:40:17,174.174 2829:trainer.py:487 do_train_dict(): eta: 8:06:03 iter: 49600 speed: 276.7 images/sec total_norm: 149.4679 (150.0831) loss: 138.9617 (140.1336) masked_loss: 1.3869 (1.4121) tag_loss: 137.5309 (138.7215) time: 1.4320 (1.8506) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4267 (1.8455) save_time: 8.8805 (16.0781) lr: 0.000025 max mem: 26307 2022-03-17 04:40:17,530.530 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6756756901741028 2022-03-17 04:40:17,531.531 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 158.37367248535156 2022-03-17 04:40:17,531.531 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.39898772978447 2022-03-17 04:40:42,473.473 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022680258378386497 2022-03-17 04:40:42,473.473 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 04:40:42,474.474 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'cooked', 'dish', 'of', 'some', 'kind', 'sitting', 'on', 'a', '[MASK]', 'with', 'two', 'bowls', 'of', 'fresh', 'vegetables', 'next', 'to', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 04:40:42,489.489 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['bowl', 'table', 'carrot', '[UNK]', 'food', 'plate', 'tomato', 'salad', 'bowls', 'design', 'dish', 'pepper', 'stem', 'vegetable', 'strawberry', 'spoon', 'orange', 'fork', 'meat', 'rice', 'fruit', 'flower', 'container', 'onion', 'cheese', 'chicken', 'different', 'bean', 'handle', 'glass', 'potato', 'other', 'lemon', 'cloth', 'white', 'wooden', 'bread', 'mushroom', 'leaf', 'shrimp', 'slice', 'pasta', 'full', 'next', 'napkin', 'sauce', 'meal', 'corn', 'olive', 'egg'] 2022-03-17 04:40:58,496.496 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'kind', 'table', 'food', 'bowl', 'fresh', 'stem', 'dish', 'lid', 'cooked', 'carrot'] 03-17 04:43:15.869 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-17 04:43:15.869 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-17 04:43:16.916 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-17 04:43:22,100.100 2829:trainer.py:487 do_train_dict(): eta: 8:03:14 iter: 49700 speed: 276.9 images/sec total_norm: 148.5189 (151.0028) loss: 140.0317 (140.8793) masked_loss: 1.4249 (1.4468) tag_loss: 138.8981 (139.4326) time: 1.4312 (1.8493) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4263 (1.8442) save_time: 8.8805 (16.0781) lr: 0.000025 max mem: 26307 2022-03-17 04:43:22,462.462 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6666666865348816 2022-03-17 04:43:22,462.462 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 103.23800659179688 2022-03-17 04:43:22,462.462 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.41179775712959 2022-03-17 04:43:47,255.255 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022702042013406754 2022-03-17 04:43:47,255.255 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 04:43:47,256.256 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'this', '[MASK]', '[MASK]', 'picture', 'if', 'elephant', 'swimming', 'in', 'the', 'lake', 'waters', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 04:43:47,271.271 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['elephant', 'ear', 'trunk', 'leg', 'tree', 'water', 'grass', 'head', '[UNK]', 'forest', 'mouth', 'eye', 'bush', 'body', 'branch', 'tail', 'wood', 'foot', 'walking', 'river', 'large', 'plant', 'leaf', 'standing', 'face', 'next', 'back', 'animal', 'shore', 'ripple', 'hair', 'flower', 'area', 'tongue', 'field', 'bird', 'young', 'big', 'couple', 'shirt', 'reflection', 'hand', 'splash', 'tall', 'wild', 'rock', 'fish', 'bank', 'background', 'short'] 2022-03-17 04:44:03,223.223 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'water', 'mouth', 'lake', 'forest', 'eye', 'tree', 'picture', 'leg', 'ear', 'grass', 'tail', 'swimming', 'bush', 'trunk', 'elephant'] 2022-03-17 04:46:27,024.024 2829:trainer.py:487 do_train_dict(): eta: 8:00:24 iter: 49800 speed: 276.9 images/sec total_norm: 149.1315 (154.4250) loss: 139.0368 (140.5827) masked_loss: 1.4775 (1.4722) tag_loss: 137.3105 (139.1106) time: 1.4321 (1.8489) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4268 (1.8437) save_time: 8.8805 (16.0781) lr: 0.000025 max mem: 26307 2022-03-17 04:46:27,385.385 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5151515007019043 2022-03-17 04:46:27,385.385 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 138.45236206054688 2022-03-17 04:46:27,385.385 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.41740675106315 2022-03-17 04:46:52,360.360 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02271447703242302 2022-03-17 04:46:52,360.360 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 04:46:52,360.360 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'is', 'holding', '[MASK]', 'hand', 'up', 'by', 'the', 'stop', 'sign', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 04:46:52,376.376 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sign', 'sky', 'letter', 'pole', 'stop', 'street', 'bolt', 'screw', 'tree', '[UNK]', 'wire', 'line', 'bracket', 'arrow', 'post', 'word', 'band', 'red', 'branch', 'top', 'number', 'green', 'intersection', 'antenna', 'front', 'blue', 'writing', 'string', 'strap', 'road', 'close', 'building', 'white', 'background', 'logo', 'light', 'border', 'shadow', 'head', 'cloud', 'stripe', 'back', 'next', 'ring', 'design', 'face', 'wall', 'language', 'metal', 'leaf'] 2022-03-17 04:47:08,344.344 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'hand', 'stop', 'ring', 'letter', 'sign', 'sky', 'finger', 'shadow', 'palm', 'pole', 'thumb', 'arrow', 'bolt', 'screw'] 2022-03-17 04:49:32,159.159 2829:trainer.py:487 do_train_dict(): eta: 7:57:35 iter: 49900 speed: 276.6 images/sec total_norm: 150.7205 (151.1491) loss: 140.5784 (142.9568) masked_loss: 1.4407 (1.4690) tag_loss: 138.7918 (141.4877) time: 1.4319 (1.8517) data: 0.0001 (0.0005) to_device: 0.0051 (0.0050) time_gpu: 1.4266 (1.8462) save_time: 8.8805 (16.0781) lr: 0.000025 max mem: 26307 2022-03-17 04:49:32,524.524 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6285714507102966 2022-03-17 04:49:32,524.524 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 101.52361297607422 2022-03-17 04:49:32,525.525 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.42669495391846 2022-03-17 04:49:57,553.553 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022757606580853462 2022-03-17 04:49:57,553.553 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 04:49:57,553.553 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'long', '##tidae', 'train', 'pulling', 'into', '[MASK]', 'station', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 04:49:57,569.569 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['train', 'track', 'window', 'sky', 'roof', 'front', 'station', '[UNK]', 'pole', 'platform', 'car', 'windshield', 'mountain', 'light', 'engine', 'building', 'sign', 'stripe', 'number', 'person', 'box', 'hill', 'door', 'tree', 'line', 'red', 'bumper', 'man', 'beam', 'wall', 'post', 'wire', 'grass', 'ground', 'sidewalk', 'passenger', 'pillar', 'bush', 'long', 'large', 'next', 'letter', 'shelter', 'gravel', 'bench', 'tower', 'traffic', 'fence', 'logo', 'white'] 2022-03-17 04:50:13,525.525 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'long', 'number', 'line', 'station', 'building', 'front', 'red', 'light', 'car', 'track', 'person', 'wall', 'hill', 'mountain', 'engine', 'window', 'train', 'sign', 'sky', 'platform', 'roof', 'pole', 'rack', 'stripe'] 2022-03-17 04:52:37,494.494 2829:trainer.py:487 do_train_dict(): eta: 7:54:45 iter: 50000 speed: 276.3 images/sec total_norm: 149.1054 (150.0296) loss: 138.9571 (140.2438) masked_loss: 1.3931 (1.4389) tag_loss: 137.3763 (138.8049) time: 1.4329 (1.8534) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4277 (1.8482) save_time: 8.8805 (16.0781) lr: 0.000025 max mem: 26307 2022-03-17 04:52:37,496.496 2829:checkpoint.py:222 save(): Saving checkpoint to output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0050000.pt 2022-03-17 04:52:46,593.593 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5555555820465088 2022-03-17 04:52:46,593.593 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 133.10516357421875 2022-03-17 04:52:46,593.593 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
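At iter 50000 the trainer writes a snapshot (model_iter_0050000.pt, the iteration zero-padded to seven digits) under the run's output/.../snapshot/ directory, and the following record shows save_time folding into the running averages. A sketch of periodic checkpointing in the same naming scheme, assuming PyTorch; checkpoint.py's actual contents are not shown in this log:

    import os
    import torch

    def save_snapshot(model, optimizer, it, out_dir):
        # Write a snapshot named like the log's model_iter_0050000.pt.
        snap_dir = os.path.join(out_dir, "snapshot")
        os.makedirs(snap_dir, exist_ok=True)
        path = os.path.join(snap_dir, "model_iter_{:07d}.pt".format(it))
        torch.save({"model": model.state_dict(),
                    "optimizer": optimizer.state_dict(),
                    "iteration": it}, path)
        return path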
= 71.43397687200063 2022-03-17 04:53:11,901.901 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02277860790491104 2022-03-17 04:53:11,902.902 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 04:53:11,902.902 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'small', 'teddy', 'bear', '[MASK]', 'in', 'the', 'fore', '##ground', '[MASK]', 'people', 'walk', 'down', 'a', 'sidewalk', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 04:53:11,917.917 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['building', 'hair', 'shirt', 'window', '[UNK]', 'woman', 'person', 'sidewalk', 'fence', 'jacket', 'eye', 'tree', 'man', 'hand', 'head', 'arm', 'mouth', 'nose', 'bag', 'railing', 'sky', 'ear', 'purse', 'face', 'backpack', 'street', 'wall', 'sign', 'strap', 'sleeve', 'food', 'lady', 'sweater', 'tongue', 'paper', 'finger', 'neck', 'pole', 'glasses', 'girl', 'road', 'light', 'car', 'bear', 'shoulder', 'hat', 'dog', 'roof', 'animal', 'boy'] 2022-03-17 04:53:27,798.798 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'face', 'small', 'line', 'building', 'road', 'street', 'woman', 'hair', 'person', 'arm', 'eye', 'window', 'tree', 'sign', 'sky', 'shirt', 'bus', 'animal', 'nose', 'bag', 'bear', 'hat', 'cap', 'pole', 'fence', 'purse', 'teddy', 'sidewalk', 'sweater', 'railing'] 2022-03-17 04:55:50,701.701 2829:trainer.py:487 do_train_dict(): eta: 7:51:58 iter: 50100 speed: 265.0 images/sec total_norm: 148.4082 (149.6757) loss: 139.4005 (141.2265) masked_loss: 1.4070 (1.4382) tag_loss: 137.9250 (139.7883) time: 1.4330 (1.9320) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4276 (1.8396) save_time: 8.8421 (15.3432) lr: 0.000025 max mem: 26307 2022-03-17 04:55:51,062.062 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5428571701049805 2022-03-17 04:55:51,063.063 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 129.89891052246094 2022-03-17 04:55:51,063.063 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.44018154220277 2022-03-17 04:56:15,986.986 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02279139868915081 2022-03-17 04:56:15,987.987 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 04:56:15,987.987 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'young', '[MASK]', 'staring', '[MASK]', 'the', 'camera', 'sticking', 'his', 'nose', 'in', 'between', 'the', 'handles', 'of', 'a', '##don', 'of', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 04:56:16,003.003 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['scissors', 'eye', 'face', 'handle', 'ear', 'eyebrow', '[UNK]', 'hand', 'screw', 'person', 'nose', 'lip', 'finger', 'hair', 'boy', 'nail', 'cheek', 'mouth', 'head', 'arm', 'bolt', 'glasses', 'man', 'blade', 'woman', 'pair', 'wall', 'forehead', 'shadow', 'girl', 'piece', 'thumb', 'red', 'string', 'background', 'shoulder', 'knot', 'shirt', 'hole', 'button', 'hat', 'design', 'young', 'reflection', 'close', 'brush', 'neck', 'metal', 'star', 'child'] 2022-03-17 04:56:31,885.885 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'old', 'face', 'young', 'hair', 'mouth', 'arm', 'boy', 'eye', 'metal', 'piece', 'pair', 'finger', 'nose', 'ear', 'camera', 'handle', 'cheek', 'shadow', 'lip', 'forehead', 'eyebrow', 'screw', 'scissors'] 2022-03-17 04:58:56,230.230 2829:trainer.py:487 do_train_dict(): eta: 7:49:08 iter: 50200 speed: 276.0 images/sec total_norm: 147.6705 (150.4382) loss: 141.8247 (142.1822) masked_loss: 1.4507 (1.4878) tag_loss: 140.2529 (140.6944) time: 1.4336 (1.8553) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4284 (1.8501) save_time: 8.8421 (15.3432) lr: 0.000024 max mem: 26307 2022-03-17 04:58:56,592.592 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7222222089767456 2022-03-17 04:58:56,593.593 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 130.3173828125 2022-03-17 04:58:56,593.593 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.45479534444942 2022-03-17 04:59:22,049.049 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022828172892332077 2022-03-17 04:59:22,050.050 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 04:59:22,050.050 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'dressed', 'in', 'business', 'casual', '[MASK]', 'is', 'alone', 'in', 'a', 'inflated', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 04:59:22,065.065 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tie', 'face', 'wall', 'man', 'shirt', 'glasses', 'nose', 'ceiling', 'hair', 'mouth', 'head', 'ear', 'eye', '[UNK]', 'arm', 'room', 'hand', 'light', 'window', 'floor', 'sleeve', 'table', 'knot', 'collar', 'jacket', 'belt', 'chair', 'neck', 'door', 'blind', 'picture', 'finger', 'suit', 'pillow', 'black', 'couch', 'switch', 'cabinet', 'young', 'lamp', 'camera', 'white', 'box', 'person', 'mirror', 'wrist', 'bag', 'bottle', 'board', 'drawer'] 2022-03-17 04:59:38,052.052 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'hand', 'face', 'room', 'light', 'business', 'board', 'hair', 'mouth', 'floor', 'wall', 'arm', 'eye', 'neck', 'window', 'box', 'jean', 'shirt', 'nose', 'tie', 'ceiling', 'glasses', 'casual', 'sleeve', 'cord', 'attire'] 2022-03-17 05:02:01,552.552 2829:trainer.py:487 do_train_dict(): eta: 7:46:18 iter: 50300 speed: 276.3 images/sec total_norm: 146.8121 (147.7029) loss: 136.5942 (139.1961) masked_loss: 1.4032 (1.4419) tag_loss: 135.4883 (137.7542) time: 1.4329 (1.8532) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4276 (1.8480) save_time: 8.8421 (15.3432) lr: 0.000024 max mem: 26307 2022-03-17 05:02:01,913.913 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.53125 2022-03-17 05:02:01,913.913 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 149.7757568359375 2022-03-17 05:02:01,913.913 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.45914998887078 2022-03-17 05:02:27,186.186 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022834159433841705 2022-03-17 05:02:27,187.187 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 05:02:27,187.187 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'young', 'boys', 'are', 'preparing', 'for', 'a', '[MASK]', 'at', 'a', '[MASK]', 'game', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 05:02:27,202.202 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['jersey', 'fence', 'helmet', 'player', 'glove', 'shoe', 'person', '[UNK]', 'number', 'shirt', 'ground', 'leg', 'man', 'arm', 'hand', 'belt', 'field', 'uniform', 'baseball', 'head', 'pole', 'dirt', 'sock', 'back', 'bat', 'batter', 'photo', 'spectator', 'line', 'hat', 'game', 'boy', 'plate', 'ball', 'home', 'cap', 'knee', 'background', 'umpire', 'logo', 'name', 'catcher', 'foot', 'grass', 'white', 'pad', 'sign', 'base', 'young', 'watch'] 2022-03-17 05:02:43,198.198 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'back', 'head', 'man', 'hand', 'number', 'game', 'play', 'young', 'player', 'field', 'ground', 'person', 'boy', 'baseball', 'ball', 'shirt', 'jersey', 'leg', 'belt', 'hat', 'cap', 'uniform', 'pole', 'dirt', 'fence', 'helmet', 'shoe', 'glove', 'batter', 'sock'] 2022-03-17 05:05:06,648.648 2829:trainer.py:487 do_train_dict(): eta: 7:43:28 iter: 50400 speed: 276.6 images/sec total_norm: 149.3395 (150.5801) loss: 135.7944 (138.4608) masked_loss: 1.4548 (1.4752) tag_loss: 134.3376 (136.9856) time: 1.4320 (1.8510) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4269 (1.8458) save_time: 8.8421 (15.3432) lr: 0.000024 max mem: 26307 2022-03-17 05:05:07,011.011 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5428571701049805 2022-03-17 05:05:07,011.011 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 130.78053283691406 2022-03-17 05:05:07,011.011 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.4602456347777 2022-03-17 05:05:32,250.250 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022814041003584862 2022-03-17 05:05:32,250.250 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 05:05:32,250.250 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'group', 'of', 'surf', '##boards', 'are', '[MASK]', 'up', 'on', 'the', 'beach', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 05:05:32,266.266 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'beach', 'person', 'cloud', 'sand', 'woman', 'building', 'man', 'bikini', 'child', 'girl', '[UNK]', 'boat', 'shirt', 'stripe', 'short', 'boy', 'water', 'hair', 'suit', 'bathing', 'shadow', 'bottom', 'chair', 'ocean', 'board', 'fin', 'top', 'towel', 'hand', 'hat', 'wave', 'blue', 'umbrella', 'surf', 'dress', 'house', 'flag', 'group', 'writing', 'head', 'window', 'logo', 'reflection', 'foot', 'family', 'sandy', 'name', 'couple', 'dog'] 2022-03-17 05:05:48,238.238 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'family', 'man', 'group', 'water', 'building', 'top', 'woman', 'short', 'board', 'hair', 'girl', 'person', 'child', 'boy', 'writing', 'beach', 'sign', 'sky', 'shirt', 'dress', 'suit', 'tank', 'flag', 'hat', 'cloud', 'cap', 'pole', 'logo', 'tent', 'banner', 'towel', 'fin', 'bathing', 'bikini'] 2022-03-17 05:08:12,113.113 2829:trainer.py:487 do_train_dict(): eta: 7:40:38 iter: 50500 speed: 276.1 images/sec total_norm: 149.1124 (150.4144) loss: 137.7408 (138.6420) masked_loss: 1.4749 (1.4566) tag_loss: 136.2069 (137.1855) time: 1.4326 (1.8547) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4274 (1.8494) save_time: 8.8421 (15.3432) lr: 0.000024 max mem: 26307 2022-03-17 05:08:12,474.474 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7058823704719543 2022-03-17 05:08:12,474.474 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 98.47399139404297 2022-03-17 05:08:12,474.474 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.4736145155232 2022-03-17 05:08:37,994.994 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02281448245048523 2022-03-17 05:08:37,994.994 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 05:08:37,995.995 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'boys', 'are', 'running', 'after', '[MASK]', 'soccer', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 05:08:38,010.010 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['grass', 'hand', 'short', 'shirt', 'ball', 'boy', 'sock', 'field', 'soccer', 'hair', 'shoe', 'logo', 'head', 'leg', 'jersey', 'shadow', 'arm', 'flower', 'young', 'uniform', 'face', 'man', 'background', 'stripe', '[UNK]', 'fence', 'ground', 'mouth', 'person', 'sleeve', 'knee', 'nose', 'child', 'pole', 'eye', 'bush', 'ear', 'tree', 'player', 'vest', 'girl', 'foot', 'game', 'collar', 'jacket', 'green', 'car', 'number', 'star', 'design'] 2022-03-17 05:08:53,892.892 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'short', 'field', 'hair', 'boy', 'eye', 'ball', 'shirt', 'jersey', 'nose', 'soccer', 'shadow', 'grass', 'flower', 'logo', 'boot', 'shoe', 'sunglasses', 'sock'] 2022-03-17 05:11:17,689.689 2829:trainer.py:487 do_train_dict(): eta: 7:37:48 iter: 50600 speed: 275.9 images/sec total_norm: 150.5359 (151.9252) loss: 140.0559 (140.1738) masked_loss: 1.4527 (1.4399) tag_loss: 138.5333 (138.7340) time: 1.4320 (1.8557) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4270 (1.8507) save_time: 8.8421 (15.3432) lr: 0.000024 max mem: 26307 2022-03-17 05:11:18,050.050 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5833333134651184 2022-03-17 05:11:18,050.050 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 142.77249145507812 2022-03-17 05:11:18,050.050 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.47939107168826 2022-03-17 05:11:43,495.495 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022816665470600128 2022-03-17 05:11:43,495.495 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 05:11:43,496.496 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'close', 'up', 'of', 'a', 'clock', '[MASK]', '[MASK]', 'a', 'steep', '##le', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 05:11:43,511.511 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['clock', 'hand', 'tower', 'sky', 'building', 'window', 'cross', 'cloud', '[UNK]', 'roof', 'face', 'top', 'wall', 'large', 'blue', 'tall', 'big', 'weather', 'number', 'design', 'vane', 'hour', 'arch', 'spire', 'day', 'cloudy', 'ornate', 'white', 'side', 'gold', 'high', 'base', 'star', 'red', 'tree', 'column', 'city', 'middle', 'picture', 'finger', 'pole', 'bottom', 'wire', 'roman', 'beautiful', 'green', 'brick', 'front', 'view', 'pillar'] 2022-03-17 05:11:59,379.379 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'hand', 'face', 'building', 'close', 'cross', 'window', 'tower', 'sky', 'roof', 'clock'] 03-17 05:13:16.976 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-17 05:13:16.976 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-17 05:13:18.044 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 94}] 2022-03-17 05:14:23,125.125 2829:trainer.py:487 do_train_dict(): eta: 7:34:58 iter: 50700 speed: 276.1 images/sec total_norm: 147.4799 (149.7552) loss: 141.0554 (140.8234) masked_loss: 1.4834 (1.5086) tag_loss: 139.4084 (139.3148) time: 1.4326 (1.8543) data: 0.0001 (0.0002) to_device: 0.0050 (0.0050) time_gpu: 1.4274 (1.8492) save_time: 8.8421 (15.3432) lr: 0.000024 max mem: 26307 2022-03-17 05:14:23,487.487 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.529411792755127 2022-03-17 05:14:23,487.487 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 132.40927124023438 2022-03-17 05:14:23,487.487 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
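The aml_server.py monitor() records interleaved with the training log (here at 05:13, and again at 05:43 and 06:13) print a Python-literal list of per-GPU dicts, so ast.literal_eval can recover them directly. A sketch, with the payload abridged to two of the eight GPUs from the record above; the 90% alert threshold is an arbitrary illustration:

import ast

payload = ("[{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, "
           "{'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 94}]")

gpus = ast.literal_eval(payload)
util = [g["gpu_util"] for g in gpus]
mem_frac = [g["mem_used"] / g["mem_total"] for g in gpus]

print(f"gpu_util min/mean: {min(util)}/{sum(util) / len(util):.1f}")  # 94/97.0
print(f"peak memory use: {max(mem_frac):.1%}")                        # 89.3%
if min(util) < 90:
    print("possible straggler or input stall on at least one GPU")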
= 71.48357393610196 2022-03-17 05:14:49,193.193 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02281193435192108 2022-03-17 05:14:49,193.193 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 05:14:49,193.193 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'blue', 'beach', 'vernacular', 'under', 'a', '[MASK]', 'umbrella', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 05:14:49,209.209 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'water', 'beach', 'ocean', 'sand', 'umbrella', 'wave', 'pole', 'chair', 'cloud', 'boat', 'shore', 'yellow', 'leg', 'shadow', '[UNK]', 'footprint', 'lawn', 'back', 'top', 'sandy', 'blue', 'towel', 'rock', 'mountain', 'day', 'person', 'cushion', 'lounge', 'empty', 'table', 'body', 'colorful', 'sun', 'area', 'next', 'sunny', 'arm', 'scene', 'bucket', 'handle', 'board', 'front', 'grass', 'couple', 'post', 'seat', 'large', 'background', 'red'] 2022-03-17 05:15:05,034.034 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'water', 'top', 'rock', 'blue', 'chair', 'beach', 'sky', 'yellow', 'ocean', 'leg', 'wave', 'shore', 'sand', 'cloud', 'pole', 'umbrella'] 2022-03-17 05:17:28,742.742 2829:trainer.py:487 do_train_dict(): eta: 7:32:08 iter: 50800 speed: 275.8 images/sec total_norm: 147.6092 (152.0821) loss: 141.1256 (140.7496) masked_loss: 1.4150 (1.3789) tag_loss: 139.6001 (139.3707) time: 1.4323 (1.8562) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4271 (1.8510) save_time: 8.8421 (15.3432) lr: 0.000024 max mem: 26307 2022-03-17 05:17:29,104.104 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7878788113594055 2022-03-17 05:17:29,104.104 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 154.2214813232422 2022-03-17 05:17:29,104.104 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.47611718468207 2022-03-17 05:17:54,736.736 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022800426930189133 2022-03-17 05:17:54,736.736 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 05:17:54,736.736 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'group', 'of', 'people', 'posing', 'for', '[MASK]', 'photograph', 'at', '[MASK]', 'black', 'tie', 'event', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 05:17:54,752.752 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['woman', 'hat', 'hand', 'man', 'hair', 'shirt', 'tree', 'person', 'necklace', 'head', '[UNK]', 'tie', 'girl', 'dress', 'sunglasses', 'short', 'boy', 'face', 'ground', 'group', 'suit', 'shoe', 'glasses', 'camera', 'baby', 'bracelet', 'bag', 'phone', 'bottle', 'purse', 'flower', 'child', 'helmet', 'watch', 'leg', 'sky', 'young', 'jacket', 'boot', 'top', 'sock', 'nose', 'other', 'fence', 'glass', 'eye', 'skirt', 'ring', 'cap', 'bench'] 2022-03-17 05:18:10,722.722 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'group', 'hand', 'face', 'black', 'woman', 'cup', 'short', 'hair', 'girl', 'person', 'event', 'boy', 'glass', 'baby', 'tree', 'watch', 'shirt', 'picture', 'dress', 'suit', 'tie', 'bottle', 'hat', 'photograph', 'shoe', 'necklace', 'sunglasses', 'groom'] 2022-03-17 05:20:34,279.279 2829:trainer.py:487 do_train_dict(): eta: 7:29:18 iter: 50900 speed: 276.0 images/sec total_norm: 149.1933 (153.1358) loss: 139.8029 (142.4593) masked_loss: 1.4123 (1.4768) tag_loss: 137.6314 (140.9825) time: 1.4326 (1.8554) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4278 (1.8502) save_time: 8.8421 (15.3432) lr: 0.000023 max mem: 26307 2022-03-17 05:20:34,641.641 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5 2022-03-17 05:20:34,641.641 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 155.60215759277344 2022-03-17 05:20:34,642.642 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.47513413522758 2022-03-17 05:21:00,160.160 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022831393405795097 2022-03-17 05:21:00,161.161 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 05:21:00,161.161 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'some', 'bicycle', 'riders', 'are', '[MASK]', 'a', 'street', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 05:21:00,177.177 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['line', 'bicycle', 'road', 'bike', 'street', 'man', 'shirt', 'person', 'pole', 'tree', 'sidewalk', 'wheel', 'sky', '[UNK]', 'building', 'sign', 'tire', 'window', 'short', 'helmet', 'light', 'jacket', 'woman', 'shoe', 'car', 'curb', 'hand', 'backpack', 'head', 'hair', 'traffic', 'house', 'jean', 'fence', 'city', 'bag', 'wall', 'stop', 'boy', 'roof', 'grass', 'bush', 'truck', 'fire', 'arrow', 'cloud', 'bus', 'leg', 'arm', 'vest'] 2022-03-17 05:21:16,092.092 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'house', 'line', 'building', 'road', 'street', 'light', 'car', 'person', 'tree', 'sky', 'shirt', 'traffic', 'roof', 'wheel', 'grass', 'bush', 'pole', 'jacket', 'bike', 'fence', 'bicycle', 'helmet', 'sidewalk', 'tire', 'backpack', 'curb', 'chimney', 'vest', 'hedge', 'biker'] 2022-03-17 05:23:39,930.930 2829:trainer.py:487 do_train_dict(): eta: 7:26:28 iter: 51000 speed: 275.8 images/sec total_norm: 148.4104 (150.0822) loss: 138.0547 (139.5933) masked_loss: 1.4620 (1.5098) tag_loss: 136.9248 (138.0835) time: 1.4325 (1.8565) data: 0.0001 (0.0005) to_device: 0.0050 (0.0049) time_gpu: 1.4273 (1.8511) save_time: 8.8421 (15.3432) lr: 0.000023 max mem: 26307 2022-03-17 05:23:40,291.291 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5555555820465088 2022-03-17 05:23:40,291.291 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 154.33363342285156 2022-03-17 05:23:40,291.291 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.47277390840236 2022-03-17 05:24:05,973.973 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022862110286951065 2022-03-17 05:24:05,974.974 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 05:24:05,974.974 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'zoo', 'enclosure', 'containing', '[MASK]', '##raf', '##fe', '##s', 'and', 'os', '##tric', '##hes', '[MASK]', 'the', '[MASK]', 'exhibit', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 05:24:05,989.989 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'grass', 'neck', 'fence', '[UNK]', 'rock', 'tail', 'shadow', 'leg', 'head', 'sky', 'bird', 'animal', 'field', 'ground', 'trunk', 'building', 'cow', 'sheep', 'zebra', 'zoo', 'feather', 'beak', 'post', 'grassy', 'horn', 'branch', 'hill', 'ear', 'bush', 'large', 'group', 'wing', 'area', 'spot', 'baby', 'goose', 'park', 'water', 'foot', 'enclosure', 'front', 'top', 'next', 'pole', 'boulder', 'stone', 'face', 'roof', 'standing'] 2022-03-17 05:24:21,929.929 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'same', 'ground', 'rock', 'neck', 'tree', 'leg', 'shadow', 'grass', 'tail', 'trunk', 'exhibit', 'fence', 'zoo', 'enclosure'] 2022-03-17 05:26:45,850.850 2829:trainer.py:487 do_train_dict(): eta: 7:23:38 iter: 51100 speed: 275.4 images/sec total_norm: 147.6986 (148.4872) loss: 141.5769 (142.8459) masked_loss: 1.4637 (1.5038) tag_loss: 140.3699 (141.3421) time: 1.4332 (1.8592) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4281 (1.8540) save_time: 8.8421 (15.3432) lr: 0.000023 max mem: 26307 2022-03-17 05:26:46,209.209 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.65625 2022-03-17 05:26:46,210.210 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 159.4515380859375 2022-03-17 05:26:46,210.210 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
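The per-record tag metrics (Tag Precision hovering around 71.5 here, Tag mAP ≈ 0.023) are computed by the pipeline over the whole batch with its own ranking and thresholds, which these logs do not show. As a reading aid only, the set-overlap idea behind precision and recall can be illustrated on the head of a Sample Generation list against the GT Tags of the zoo record above:

# Illustrative only: top-5 of the Sample Generation list vs. the GT
# Tags of the same record. Not the pipeline's actual metric code.
pred = ['tree', 'grass', 'neck', 'fence', '[UNK]']
gt = {'[UNK]', 'head', 'same', 'ground', 'rock', 'neck', 'tree', 'leg',
      'shadow', 'grass', 'tail', 'trunk', 'exhibit', 'fence', 'zoo',
      'enclosure'}

hits = [t for t in pred if t in gt]
print(f"precision@5 = {len(hits) / len(pred):.2f}")  # 1.00
print(f"recall@5    = {len(hits) / len(gt):.2f}")    # 0.31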
= 71.46929217129946 2022-03-17 05:27:12,076.076 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02284674160182476 2022-03-17 05:27:12,076.076 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 05:27:12,077.077 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'woman', 'portuguese', 'her', 'horse', 'and', 'aleksandr', 'in', 'a', 'fence', '##d', 'field', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 05:27:12,092.092 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['grass', 'fence', 'dog', 'ear', 'leg', 'head', 'hair', 'girl', 'tail', 'field', 'horse', 'shirt', 'mane', '[UNK]', 'flower', 'post', 'woman', 'leash', 'hand', 'child', 'chain', 'face', 'arm', 'dress', 'neck', 'rope', 'person', 'black', 'mouth', 'brown', 'glasses', 'tongue', 'watch', 'back', 'collar', 'nose', 'tree', 'couple', 'short', 'eye', 'harness', 'foot', 'next', 'lady', 'wire', 'spot', 'pony', 'top', 'body', 'fur'] 2022-03-17 05:27:28,041.041 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'back', 'head', 'hand', 'woman', 'field', 'hair', 'girl', 'post', 'arm', 'horse', 'shirt', 'dog', 'leg', 'bag', 'ear', 'grass', 'tail', 'rope', 'fence', 'boot'] 2022-03-17 05:29:51,538.538 2829:trainer.py:487 do_train_dict(): eta: 7:20:48 iter: 51200 speed: 275.7 images/sec total_norm: 149.6668 (151.3736) loss: 140.0519 (139.6391) masked_loss: 1.3956 (1.4176) tag_loss: 138.5114 (138.2215) time: 1.4318 (1.8569) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4268 (1.8518) save_time: 8.8421 (15.3432) lr: 0.000023 max mem: 26307 2022-03-17 05:29:51,899.899 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4571428596973419 2022-03-17 05:29:51,899.899 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 151.3666534423828 2022-03-17 05:29:51,899.899 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.47377428348534 2022-03-17 05:30:17,620.620 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022847270593047142 2022-03-17 05:30:17,621.621 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 05:30:17,621.621 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'man', 'is', 'walking', 'in', 'a', '[MASK]', 'with', 'an', 'umbrella', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 05:30:17,636.636 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'umbrella', 'tree', 'person', 'street', 'ground', 'building', 'road', '[UNK]', 'leg', 'pole', 'man', 'truck', 'cloud', 'coat', 'car', 'sidewalk', 'fence', 'wheel', 'bag', 'foot', 'shoe', 'light', 'photo', 'bus', 'sign', 'jacket', 'tire', 'wire', 'shadow', 'woman', 'line', 'rain', 'van', 'boot', 'background', 'black', 'head', 'couple', 'roof', 'purse', 'rainy', 'white', 'picture', 'crane', 'wall', 'hand', 'lamp', 'post', 'brick'] 2022-03-17 05:30:33,639.639 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'road', 'street', 'light', 'ground', 'person', 'arm', 'base', 'van', 'foot', 'tree', 'sign', 'sky', 'bus', 'leg', 'bag', 'truck', 'billboard', 'wheel', 'coat', 'monument', 'cloud', 'statue', 'photo', 'pole', 'trailer', 'tent', 'courtyard', 'umbrella'] 2022-03-17 05:32:57,299.299 2829:trainer.py:487 do_train_dict(): eta: 7:17:57 iter: 51300 speed: 275.6 images/sec total_norm: 147.9342 (150.8931) loss: 142.3930 (144.0219) masked_loss: 1.4029 (1.4181) tag_loss: 140.9289 (142.6038) time: 1.4327 (1.8576) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4277 (1.8525) save_time: 8.8421 (15.3432) lr: 0.000023 max mem: 26307 2022-03-17 05:32:57,660.660 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4848484992980957 2022-03-17 05:32:57,660.660 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 144.57009887695312 2022-03-17 05:32:57,660.660 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.47661365701994 2022-03-17 05:33:23,353.353 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0228738933801651 2022-03-17 05:33:23,355.355 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 05:33:23,355.355 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'very', 'tall', 'building', 'with', 'a', 'clock', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 05:33:23,371.371 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'tower', 'building', 'moon', 'window', 'clock', 'roof', 'wall', 'cross', 'top', '[UNK]', 'tree', 'light', 'spire', 'blue', 'pole', 'hand', 'cloud', 'tall', 'arch', 'background', 'church', 'bird', 'view', 'large', 'clear', 'street', 'statue', 'structure', 'flag', 'brick', 'city', 'vane', 'night', 'snow', 'archway', 'chimney', 'fence', 'door', 'house', 'doorway', 'weather', 'front', 'distance', 'branch', 'red', 'railing', 'person', 'day', 'leaf'] 2022-03-17 05:33:39,314.314 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['building', 'wall', 'cross', 'window', 'tower', 'sky', 'tall', 'moon', 'roof', 'clock', 'pillar'] 2022-03-17 05:36:03,064.064 2829:trainer.py:487 do_train_dict(): eta: 7:15:07 iter: 51400 speed: 275.6 images/sec total_norm: 147.7725 (150.5090) loss: 141.1639 (140.6418) masked_loss: 1.4593 (1.4557) tag_loss: 139.9064 (139.1860) time: 1.4317 (1.8577) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4263 (1.8525) save_time: 8.8421 (15.3432) lr: 0.000023 max mem: 26307 2022-03-17 05:36:03,425.425 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6285714507102966 2022-03-17 05:36:03,426.426 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 132.674072265625 2022-03-17 05:36:03,426.426 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.47598486594784 2022-03-17 05:36:29,253.253 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0229461919516325 2022-03-17 05:36:29,253.253 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 05:36:29,254.254 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'riding', 'ski', '##s', 'across', 'a', 'snow', 'covered', 'slope', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 05:36:29,269.269 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'tree', 'snow', 'jacket', 'ski', 'man', 'pole', 'ground', 'fence', 'head', 'boot', 'glove', 'face', 'coat', 'glasses', 'sky', 'hand', 'leg', 'foot', 'person', 'building', 'trunk', 'skier', 'poles', 'hair', 'shoe', 'hat', 'background', 'snowy', 'house', 'track', 'hill', 'slope', 'helmet', 'arm', 'top', 'strap', 'backpack', 'hood', 'cloud', 'woman', 'mountain', 'stick', 'post', 'roof', 'black', 'next', 'bench', 'sign', 'guy'] 2022-03-17 05:36:45,168.168 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'house', 'hand', 'face', 'building', 'ground', 'hair', 'tree', 'sign', 'snow', 'coat', 'pole', 'jacket', 'hood', 'glasses', 'ski', 'fence', 'boot', 'slope', 'glove'] 2022-03-17 05:39:08,972.972 2829:trainer.py:487 do_train_dict(): eta: 7:12:17 iter: 51500 speed: 275.4 images/sec total_norm: 148.2207 (150.6410) loss: 141.6054 (140.1483) masked_loss: 1.4545 (1.4760) tag_loss: 139.9645 (138.6723) time: 1.4320 (1.8591) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4270 (1.8539) save_time: 8.8421 (15.3432) lr: 0.000022 max mem: 26307 2022-03-17 05:39:09,336.336 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663 2022-03-17 05:39:09,336.336 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 126.5184555053711 2022-03-17 05:39:09,336.336 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.48144597046135 2022-03-17 05:39:35,358.358 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022993769496679306 2022-03-17 05:39:35,358.358 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 05:39:35,359.359 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', '[MASK]', 'two', 'bikes', 'crosses', 'the', 'road', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 05:39:35,374.374 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['man', 'person', 'bicycle', 'bike', 'shirt', '[UNK]', 'building', 'street', 'cone', 'hair', 'vest', 'chain', 'hat', 'road', 'line', 'shoe', 'stripe', 'tire', 'jacket', 'hand', 'sign', 'window', 'arm', 'head', 'tree', 'pole', 'woman', 'background', 'wheel', 'traffic', 'collar', 'cap', 'logo', 'jean', 'motorcycle', 'barrier', 'sky', 'backpack', 'camera', 'light', 'letter', 'mirror', 'city', 'basket', 'wall', 'back', 'barrel', 'bag', 'crowd', 'group'] 2022-03-17 05:39:51,336.336 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'line', 'building', 'road', 'street', 'hair', 'person', 'wall', 'window', 'sign', 'shirt', 'chain', 'wheel', 'hat', 'cap', 'bike', 'logo', 'fence', 'collar', 'bicycle', 'shoe', 'tire', 'cone', 'pedal', 'vest', 'stripe'] 2022-03-17 05:42:15,114.114 2829:trainer.py:487 do_train_dict(): eta: 7:09:26 iter: 51600 speed: 275.1 images/sec total_norm: 148.3121 (151.9691) loss: 141.3788 (142.1773) masked_loss: 1.4018 (1.4124) tag_loss: 139.9413 (140.7649) time: 1.4323 (1.8614) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4271 (1.8563) save_time: 8.8421 (15.3432) lr: 0.000022 max mem: 26307 2022-03-17 05:42:15,475.475 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7878788113594055 2022-03-17 05:42:15,475.475 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 151.30194091796875 2022-03-17 05:42:15,475.475 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.48433929531902 2022-03-17 05:42:41,580.580 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022996308282017708 2022-03-17 05:42:41,581.581 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 05:42:41,581.581 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'young', 'and', 'old', 'person', 'are', 'playing', '[MASK]', 'games', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 05:42:41,596.596 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['glasses', 'hand', 'hair', 'woman', 'shirt', 'wall', 'face', 'window', 'arm', 'sweater', '[UNK]', 'controller', 'jean', 'remote', 'room', 'head', 'lady', 'game', 'couch', 'floor', 'blind', 'mouth', 'table', 'nose', 'ceiling', 'chair', 'leg', 'pillow', 'girl', 'watch', 'door', 'shelf', 'ear', 'curtain', 'sofa', 'box', 'wii', 'light', 'book', 'wrist', 'bottle', 'picture', 'handle', 'bag', 'video', 'cabinet', 'plant', 'shoe', 'person', 'belt'] 2022-03-17 05:42:57,584.584 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'hand', 'old', 'face', 'room', 'young', 'woman', 'cup', 'living', 'hair', 'girl', 'video', 'person', 'table', 'wall', 'arm', 'boy', 'chair', 'window', 'watch', 'shirt', 'kid', 'ottoman', 'blind', 'couch', 'glasses', 'skirt', 'pillow', 'sofa', 'sweater', 'cushion'] 03-17 05:43:18.123 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-17 05:43:18.123 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-17 05:43:19.494 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-17 05:45:20,986.986 2829:trainer.py:487 do_train_dict(): eta: 7:06:36 iter: 51700 speed: 275.5 images/sec total_norm: 147.2572 (150.6359) loss: 139.5363 (139.7787) masked_loss: 1.4213 (1.4252) tag_loss: 138.2551 (138.3536) time: 1.4312 (1.8587) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4262 (1.8535) save_time: 8.8421 (15.3432) lr: 0.000022 max mem: 26307 2022-03-17 05:45:21,347.347 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5142857432365417 2022-03-17 05:45:21,348.348 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 135.24557495117188 2022-03-17 05:45:21,348.348 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.48552903429422 2022-03-17 05:45:47,344.344 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022976456210017204 2022-03-17 05:45:47,345.345 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 05:45:47,345.345 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'number', 'of', 'colorful', 'kite', '[MASK]', 'flying', 'under', 'a', 'ramps', 'sky', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 05:45:47,360.360 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['kite', 'sky', 'man', 'shirt', 'person', 'hair', 'string', 'flag', 'head', 'woman', '[UNK]', 'grass', 'field', 'tent', 'sunglasses', 'jacket', 'crowd', 'ground', 'building', 'tail', 'hat', 'face', 'pole', 'bicycle', 'air', 'ear', 'number', 'park', 'jean', 'glasses', 'tree', 'eye', 'hand', 'child', 'group', 'fence', 'balloon', 'bike', 'large', 'coat', 'cloud', 'beach', 'shadow', 'bag', 'wall', 'street', 'arm', 'sign', 'hill', 'boy'] 2022-03-17 05:46:03,346.346 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'number', 'hair', 'blue', 'person', 'tree', 'sky', 'shirt', 'crowd', 'string', 'jacket', 'bike', 'colorful', 'kite'] 2022-03-17 05:48:27,042.042 2829:trainer.py:487 do_train_dict(): eta: 7:03:45 iter: 51800 speed: 275.2 images/sec total_norm: 147.3880 (150.1413) loss: 139.3582 (140.6168) masked_loss: 1.4529 (1.4726) tag_loss: 138.3924 (139.1443) time: 1.4310 (1.8605) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4257 (1.8554) save_time: 8.8421 (15.3432) lr: 0.000022 max mem: 26307 2022-03-17 05:48:27,403.403 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6000000238418579 2022-03-17 05:48:27,403.403 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 110.99067687988281 2022-03-17 05:48:27,403.403 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.49655274113708 2022-03-17 05:48:53,571.571 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0229787640273571 2022-03-17 05:48:53,572.572 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 05:48:53,572.572 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'bush', 'filled', 'with', '[MASK]', 'of', 'purple', 'flowers', 'near', 'water', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 05:48:53,587.587 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['flower', 'sky', 'water', 'plant', 'bench', 'tree', 'hill', 'grass', 'cloud', 'field', 'bush', 'building', 'person', '[UNK]', 'city', 'boat', 'background', 'bridge', 'pole', 'garden', 'lake', 'park', 'shirt', 'sidewalk', 'front', 'ground', 'large', 'tower', 'river', 'blue', 'next', 'top', 'trunk', 'man', 'bicycle', 'dirt', 'head', 'woman', 'post', 'fence', 'green', 'house', 'branch', 'light', 'lamp', 'leaf', 'boy', 'pot', 'umbrella', 'road'] 2022-03-17 05:49:09,480.480 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'water', 'building', 'field', 'hill', 'plant', 'train', 'tree', 'sky', 'shirt', 'wheel', 'grass', 'bush', 'cloud', 'purple', 'flower', 'bench', 'bike', 'bicycle'] 2022-03-17 05:51:33,098.098 2829:trainer.py:487 do_train_dict(): eta: 7:00:55 iter: 51900 speed: 275.2 images/sec total_norm: 148.6713 (154.2171) loss: 140.7731 (140.5863) masked_loss: 1.4384 (1.4408) tag_loss: 139.3752 (139.1456) time: 1.4319 (1.8606) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4269 (1.8555) save_time: 8.8421 (15.3432) lr: 0.000022 max mem: 26307 2022-03-17 05:51:33,460.460 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7941176295280457 2022-03-17 05:51:33,460.460 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 129.7630615234375 2022-03-17 05:51:33,460.460 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
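The caption acc values (0.46 to 0.79 across these records) are consistent with a fraction of masked caption positions predicted correctly, but the log does not spell the computation out, so treat the following as a guess at its shape rather than the pipeline's code. A PyTorch-style sketch with illustrative tensor names and shapes:

import torch

def masked_accuracy(logits, labels, ignore_index=-100):
    # Accuracy over masked positions only; unmasked/[PAD] positions
    # carry ignore_index and are excluded from the denominator.
    preds = logits.argmax(dim=-1)
    valid = labels != ignore_index
    correct = (preds == labels) & valid
    return correct.sum().float() / valid.sum().clamp(min=1)

logits = torch.randn(2, 5, 100)          # (batch, seq, vocab), illustrative
labels = torch.full((2, 5), -100)        # nothing masked yet
labels[0, 2] = 7                         # one masked position, label id 7
print(masked_accuracy(logits, labels))   # 0.0 or 1.0 for this single position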
= 71.49959303782536 2022-03-17 05:51:59,600.600 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022981395944952965 2022-03-17 05:51:59,600.600 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 05:51:59,601.601 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'dirty', 'white', 'toilet', 'filled', 'with', 'beverage', 'containers', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 05:51:59,616.616 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['toilet', 'floor', 'water', 'bowl', 'wall', 'lid', 'seat', 'bathroom', 'tile', 'head', '[UNK]', 'arm', 'hair', 'handle', 'shirt', 'hand', 'tail', 'small', 'line', 'cap', 'baby', 'tank', 'ear', 'toy', 'short', 'child', 'paper', 'beak', 'man', 'trash', 'can', 'person', 'metal', 'shoe', 'white', 'foot', 'boy', 'ground', 'animal', 'pipe', 'body', 'leg', 'bear', 'brush', 'bird', 'top', 'glove', 'towel', 'cup', 'green'] 2022-03-17 05:52:15,566.566 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['water', 'body', 'white', 'floor', 'wall', 'seat', 'bowl', 'bottle', 'cap', 'dirty', 'pole', 'shoe', 'toilet', 'lid', 'beverage'] 2022-03-17 05:54:39,196.196 2829:trainer.py:487 do_train_dict(): eta: 6:58:04 iter: 52000 speed: 275.1 images/sec total_norm: 148.8579 (152.1922) loss: 140.9233 (140.0620) masked_loss: 1.4459 (1.4548) tag_loss: 139.4069 (138.6072) time: 1.4319 (1.8610) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4266 (1.8558) save_time: 8.8421 (15.3432) lr: 0.000022 max mem: 26307 2022-03-17 05:54:39,557.557 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6176470518112183 2022-03-17 05:54:39,557.557 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 148.58297729492188 2022-03-17 05:54:39,557.557 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.49819077182411 2022-03-17 05:55:05,623.623 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02299141138792038 2022-03-17 05:55:05,623.623 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 05:55:05,624.624 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'objects', 'are', 'sitting', 'on', 'top', 'of', 'a', 'glass', 'table', '.', '##yx', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 05:55:05,639.639 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'hand', 'can', 'man', 'couch', 'floor', 'shirt', 'shoe', 'hair', 'leg', 'person', 'rug', 'table', 'head', 'carpet', 'glasses', 'wall', 'face', 'bag', 'arm', 'chair', 'bottle', 'jean', 'phone', 'trash', 'ear', 'jacket', 'nose', 'cap', 'watch', 'glass', 'remote', 'blanket', 'mat', 'room', 'box', 'cup', 'screen', 'wheel', 'sofa', 'book', 'boy', 'beer', 'finger', 'door', 'cell', 'controller', 'shelf', 'soda', 'mouth'] 2022-03-17 05:55:21,634.634 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'can', 'man', 'hand', 'top', 'control', 'person', 'floor', 'table', 'wall', 'phone', 'glass', 'box', 'cell', 'cd', 'jean', 'shirt', 'label', 'bag', 'bowl', 'beer', 'bottle', 'cap', 'couch', 'remote', 'shoe', 'candle', 'jar'] 2022-03-17 05:57:45,465.465 2829:trainer.py:487 do_train_dict(): eta: 6:55:13 iter: 52100 speed: 274.9 images/sec total_norm: 149.3275 (152.0727) loss: 139.6746 (140.1645) masked_loss: 1.4134 (1.4525) tag_loss: 138.1430 (138.7120) time: 1.4317 (1.8627) data: 0.0001 (0.0005) to_device: 0.0051 (0.0050) time_gpu: 1.4268 (1.8572) save_time: 8.8421 (15.3432) lr: 0.000022 max mem: 26307 2022-03-17 05:57:45,826.826 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6363636255264282 2022-03-17 05:57:45,826.826 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 144.1970672607422 2022-03-17 05:57:45,826.826 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.49746650754264 2022-03-17 05:58:12,149.149 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022987600415945053 2022-03-17 05:58:12,150.150 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 05:58:12,150.150 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'three', '[MASK]', '[MASK]', 'ski', '##s', 'are', 'standing', 'on', 'a', 'slope', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 05:58:12,165.165 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['jacket', '[UNK]', 'sky', 'pole', 'ski', 'glove', 'snow', 'person', 'head', 'coat', 'hand', 'hat', 'woman', 'man', 'skier', 'boot', 'ground', 'zipper', 'slope', 'face', 'helmet', 'hair', 'outfit', 'leg', 'poles', 'arm', 'couple', 'sunglasses', 'cap', 'snowy', 'hill', 'foot', 'scarf', 'top', 'group', 'girl', 'hood', 'shadow', 'shoe', 'shirt', 'other', 'backpack', 'orange', 'tree', 'glasses', 'line', 'side', 'skiing', 'female', 'day'] 2022-03-17 05:58:28,162.162 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'hand', 'woman', 'ground', 'hair', 'person', 'sky', 'snow', 'wheel', 'coat', 'hat', 'pole', 'jacket', 'ski', 'boot', 'slope', 'helmet', 'glove', 'skier'] 2022-03-17 06:00:51,661.661 2829:trainer.py:487 do_train_dict(): eta: 6:52:22 iter: 52200 speed: 275.0 images/sec total_norm: 145.5194 (147.7424) loss: 136.4378 (137.5636) masked_loss: 1.4403 (1.4753) tag_loss: 135.0829 (136.0882) time: 1.4325 (1.8619) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4274 (1.8567) save_time: 8.8421 (15.3432) lr: 0.000021 max mem: 26307 2022-03-17 06:00:52,020.020 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361 2022-03-17 06:00:52,020.020 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 147.0973358154297 2022-03-17 06:00:52,021.021 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.50326824735274 2022-03-17 06:01:18,552.552 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023031720891594887 2022-03-17 06:01:18,553.553 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 06:01:18,553.553 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'person', 'is', 'doing', 'something', 'that', '[MASK]', 'very', 'interesting', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 06:01:18,568.568 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'grass', 'ear', 'sheep', 'fence', 'hair', 'sky', 'arm', 'boy', 'wall', 'tree', 'head', 'gravel', 'leg', 'stump', 'door', 'ground', 'shadow', 'animal', 'field', 'log', 'post', 'sleeve', 'face', 'barn', '[UNK]', 'background', 'hand', 'rock', 'building', 'young', 'child', 'tail', 'wool', 'cloud', 'lamb', 'car', 'road', 'person', 'gate', 'small', 'wood', 'dirt', 'camera', 'pole', 'goat', 'baby', 'grazing', 'little', 'dog'] 2022-03-17 06:01:34,630.630 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'something', 'door', 'ground', 'hair', 'post', 'person', 'wall', 'arm', 'boy', 'tree', 'sky', 'shirt', 'animal', 'leg', 'ear', 'shadow', 'grass', 'tail', 'interesting', 'sheep', 'fence', 'log', 'elbow', 'sleeve', 'gravel', 'stump'] 2022-03-17 06:03:57,977.977 2829:trainer.py:487 do_train_dict(): eta: 6:49:32 iter: 52300 speed: 274.8 images/sec total_norm: 148.7513 (150.3584) loss: 138.9591 (140.4302) masked_loss: 1.4191 (1.4284) tag_loss: 137.8493 (139.0018) time: 1.4328 (1.8631) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4278 (1.8581) save_time: 8.8421 (15.3432) lr: 0.000021 max mem: 26307 2022-03-17 06:03:58,342.342 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6764705777168274 2022-03-17 06:03:58,342.342 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 183.23245239257812 2022-03-17 06:03:58,342.342 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.49200868242569 2022-03-17 06:04:24,737.737 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023032614961266518 2022-03-17 06:04:24,738.738 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 06:04:24,738.738 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'bunch', 'of', '[MASK]', 'animals', 'stacked', 'on', 'top', '[MASK]', 'each', 'other', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 06:04:24,753.753 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['bear', 'window', 'nose', 'teddy', 'building', 'head', 'ear', 'sign', 'wall', 'eye', 'sidewalk', 'shirt', 'handle', '[UNK]', 'motorcycle', 'ground', 'ribbon', 'store', 'bike', 'door', 'curb', 'face', 'uniform', 'chair', 'man', 'hair', 'animal', 'wheel', 'light', 'leg', 'arm', 'hat', 'floor', 'person', 'bow', 'mouth', 'blue', 'car', 'tag', 'street', 'stuffed', 'jacket', 'bag', 'road', 'tire', 'bat', 'logo', 'hand', 'stripe', 'foot'] 2022-03-17 06:04:40,691.691 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['other', 'head', 'hand', 'face', 'building', 'top', 'book', 'mouth', 'wall', 'seat', 'arm', 'smile', 'eye', 'chair', 'foot', 'window', 'box', 'sign', 'shirt', 'teeth', 'animal', 'nose', 'ear', 'bear', 'uniform', 'tag', 'bat', 'patch', 'bunch', 'monkey', 'doll', 'teddy', 'stuffed', 'lid'] 2022-03-17 06:07:04,118.118 2829:trainer.py:487 do_train_dict(): eta: 6:46:41 iter: 52400 speed: 275.1 images/sec total_norm: 148.5133 (151.8342) loss: 139.5278 (140.1417) masked_loss: 1.3528 (1.4331) tag_loss: 137.5445 (138.7086) time: 1.4311 (1.8614) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4259 (1.8562) save_time: 8.8421 (15.3432) lr: 0.000021 max mem: 26307 2022-03-17 06:07:04,479.479 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4722222089767456 2022-03-17 06:07:04,480.480 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 142.93185424804688 2022-03-17 06:07:04,480.480 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.49530140468052 2022-03-17 06:07:30,901.901 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02307848446071148 2022-03-17 06:07:30,901.901 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 06:07:30,902.902 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'bear', 'is', 'on', 'the', 'pillow', 'and', 'a', 'jacket', 'is', '[MASK]', '[MASK]', 'bed', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 06:07:30,917.917 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['bed', 'window', 'pillow', 'blanket', 'bear', 'blind', 'room', 'teddy', 'ear', 'sheet', 'tree', 'wall', 'bedroom', 'head', '[UNK]', 'lamp', 'clothes', 'animal', 'shirt', 'stuffed', 'curtain', 'nightstand', 'foot', 'book', 'shade', 'arm', 'person', 'light', 'cat', 'nose', 'floor', 'large', 'table', 'leg', 'top', 'jacket', 'clock', 'chair', 'next', 'post', 'picture', 'tail', 'couple', 'paw', 'small', 'paper', 'dresser', 'white', 'cover', 'building'] 2022-03-17 06:07:46,828.828 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['room', 'bed', 'window', 'tree', 'shirt', 'clothes', 'ear', 'bear', 'blind', 'jacket', 'blanket', 'pillow', 'lamp', 'teddy'] 2022-03-17 06:10:10,588.588 2829:trainer.py:487 do_train_dict(): eta: 6:43:50 iter: 52500 speed: 274.6 images/sec total_norm: 150.9259 (152.7547) loss: 138.8904 (139.2738) masked_loss: 1.4213 (1.4351) tag_loss: 137.1731 (137.8387) time: 1.4336 (1.8647) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4286 (1.8596) save_time: 8.8421 (15.3432) lr: 0.000021 max mem: 26307 2022-03-17 06:10:10,949.949 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7272727489471436 2022-03-17 06:10:10,949.949 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 127.15536499023438 2022-03-17 06:10:10,949.949 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.50250674200602 2022-03-17 06:10:37,455.455 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023070955649018288 2022-03-17 06:10:37,455.455 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 06:10:37,455.455 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'group', 'of', 'young', 'people', 'are', 'crossing', 'the', '[MASK]', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 06:10:37,471.471 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sign', 'light', 'tree', 'street', 'pole', 'car', 'sky', 'building', 'cone', 'sidewalk', 'road', 'person', 'traffic', '[UNK]', 'line', 'city', 'man', 'can', 'window', 'store', 'shirt', 'truck', 'fire', 'woman', 'shadow', 'arrow', 'curb', 'trash', 'intersection', 'van', 'flag', 'mountain', 'box', 'bag', 'bench', 'fence', 'jacket', 'stop', 'jean', 'wall', 'barrier', 'balcony', 'lamp', 'suv', 'roof', 'booth', 'corner', 'letter', 'cover', 'cart'] 2022-03-17 06:10:53,403.403 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'can', 'city', 'man', 'group', 'line', 'building', 'road', 'street', 'young', 'light', 'woman', 'car', 'hair', 'person', 'window', 'tree', 'store', 'sign', 'sky', 'jean', 'shirt', 'bus', 'traffic', 'bag', 'hat', 'pole', 'purse', 'balcony', 'cone'] 2022-03-17 06:13:17,095.095 2829:trainer.py:487 do_train_dict(): eta: 6:40:59 iter: 52600 speed: 274.5 images/sec total_norm: 149.9317 (151.5256) loss: 140.9038 (140.4717) masked_loss: 1.4318 (1.4907) tag_loss: 139.1198 (138.9810) time: 1.4329 (1.8651) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4279 (1.8600) save_time: 8.8421 (15.3432) lr: 0.000021 max mem: 26307 2022-03-17 06:13:17,455.455 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6363636255264282 2022-03-17 06:13:17,455.455 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 132.94970703125 2022-03-17 06:13:17,456.456 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.50741239109799 03-17 06:13:19.595 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-17 06:13:19.595 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-17 06:13:20.315 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 0}] 2022-03-17 06:13:43,694.694 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02310938946902752 2022-03-17 06:13:43,695.695 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 06:13:43,695.695 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'small', 'tropical', 'umbrella', 'sits', 'on', '[MASK]', 'patch', 'of', 'grass', 'at', 'the', 'beach', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 06:13:43,710.710 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sand', 'umbrella', 'beach', 'wave', 'shadow', 'ground', 'rug', 'flower', 'towel', 'person', 'water', 'butterfly', '[UNK]', 'grass', 'design', 'pole', 'carpet', 'shore', 'leaf', 'sun', 'ball', 'background', 'handle', 'rock', 'mat', 'green', 'bird', 'sky', 'light', 'colorful', 'post', 'ocean', 'cloud', 'object', 'sandy', 'patch', 'moss', 'snow', 'puddle', 'little', 'logo', 'open', 'wing', 'inside', 'man', 'piece', 'next', 'body', 'top', 'footprint'] 2022-03-17 06:13:59,715.715 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['small', 'ground', 'person', 'beach', 'sky', 'background', 'wave', 'tropical', 'shadow', 'sand', 'grass', 'flower', 'patch', 'towel', 'umbrella', 'rug'] 2022-03-17 06:16:23,595.595 2829:trainer.py:487 do_train_dict(): eta: 6:38:08 iter: 52700 speed: 274.5 images/sec total_norm: 149.2330 (151.5919) loss: 142.3542 (142.5090) masked_loss: 1.4248 (1.4977) tag_loss: 140.8760 (141.0114) time: 1.4324 (1.8651) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4271 (1.8598) save_time: 8.8421 (15.3432) lr: 0.000021 max mem: 26307 2022-03-17 06:16:23,955.955 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6857143044471741 2022-03-17 06:16:23,956.956 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 130.64234924316406 2022-03-17 06:16:23,956.956 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.51453000126463 2022-03-17 06:16:50,185.185 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023102670907974243 2022-03-17 06:16:50,186.186 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 06:16:50,186.186 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'boy', 'with', 'a', 'uno', '##pen', '##ed', 'tooth', '##brush', 'in', 'his', 'mouth', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 06:16:50,201.201 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'ear', 'nose', 'boy', 'eye', 'hand', 'hair', 'mouth', 'head', '[UNK]', 'face', 'wall', 'writing', 'door', 'sleeve', 'arm', 'handle', 'frame', 'brush', 'logo', 'design', 'letter', 'tooth', 'finger', 'child', 'table', 'light', 'doorway', 'picture', 'baby', 'curtain', 'young', 'window', 'background', 'box', 'lettering', 'small', 'short', 'shelf', 'button', 'room', 'word', 'little', 'blind', 'blue', 'bottle', 'kid', 'bag', 'pajamas', 'toy'] 2022-03-17 06:17:06,153.153 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'can', 'head', 'hand', 'little', 'face', 'top', 'hair', 'mouth', 'wall', 'arm', 'boy', 'writing', 'eye', 'letter', 'shirt', 'nose', 'ear', 'logo', 'sleeve', 'container', 'microphone'] 2022-03-17 06:19:31,827.827 2829:trainer.py:487 do_train_dict(): eta: 6:35:18 iter: 52800 speed: 272.0 images/sec total_norm: 149.3551 (151.0412) loss: 141.7490 (141.3010) masked_loss: 1.4350 (1.4581) tag_loss: 140.5589 (139.8429) time: 1.4330 (1.8822) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4278 (1.8770) save_time: 8.8421 (15.3432) lr: 0.000020 max mem: 26307 2022-03-17 06:19:32,188.188 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6388888955116272 2022-03-17 06:19:32,188.188 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 139.81195068359375 2022-03-17 06:19:32,188.188 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.51842562958514 2022-03-17 06:19:58,497.497 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023098180070519447 2022-03-17 06:19:58,497.497 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 06:19:58,497.497 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'tennis', 'player', 'trying', 'to', 'hit', 'the', 'ball', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 06:19:58,513.513 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'line', '[UNK]', 'court', 'tennis', 'hand', 'man', 'shoe', 'short', 'arm', 'head', 'leg', 'sock', 'hair', 'ball', 'player', 'net', 'logo', 'shadow', 'ground', 'person', 'pole', 'sign', 'letter', 'knee', 'banner', 'handle', 'face', 'air', 'blue', 'cap', 'band', 'chair', 'white', 'foot', 'stand', 'uniform', 'wall', 'hat', 'male', 'stripe', 'match', 'string', 'beard', 'outfit', 'sleeve', 'top', 'fence', 'writing', 'bag'] 2022-03-17 06:20:14,438.438 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'line', 'player', 'court', 'short', 'hair', 'post', 'arm', 'ball', 'shirt', 'leg', 'handle', 'tennis', 'shadow', 'net', 'pole', 'shoe', 'sock'] 2022-03-17 06:22:38,402.402 2829:trainer.py:487 do_train_dict(): eta: 6:32:27 iter: 52900 speed: 274.4 images/sec total_norm: 148.2379 (153.0615) loss: 135.4847 (137.7019) masked_loss: 1.4718 (1.4906) tag_loss: 133.8302 (136.2114) time: 1.4328 (1.8658) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4276 (1.8606) save_time: 8.8421 (15.3432) lr: 0.000020 max mem: 26307 2022-03-17 06:22:38,763.763 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5454545617103577 2022-03-17 06:22:38,764.764 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 145.037353515625 2022-03-17 06:22:38,764.764 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.52613963720934 2022-03-17 06:23:05,421.421 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023095954209566116 2022-03-17 06:23:05,421.421 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 06:23:05,421.421 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'picture', 'of', 'some', '[MASK]', 'sitting', 'down', 'at', '[MASK]', 'table', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 06:23:05,437.437 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['table', 'hair', 'shirt', 'hand', 'chair', 'box', 'window', 'napkin', 'girl', 'cup', 'wall', '[UNK]', 'paper', 'head', 'lid', 'watch', 'woman', 'plate', 'food', 'face', 'glasses', 'bag', 'container', 'arm', 'sandwich', 'necklace', 'coffee', 'tray', 'fork', 'sunglasses', 'boy', 'eye', 'ear', 'nose', 'phone', 'tissue', 'restaurant', 'mouth', 'child', 'purse', 'book', 'floor', 'hamburger', 'top', 'glass', 'bread', 'door', 'person', 'bottle', 'jar'] 2022-03-17 06:23:21,403.403 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'hand', 'book', 'woman', 'cup', 'hair', 'girl', 'mouth', 'child', 'table', 'wall', 'arm', 'eye', 'chair', 'paper', 'window', 'watch', 'box', 'jean', 'shirt', 'picture', 'bag', 'toy', 'shoe', 'container', 'tray', 'lid', 'jar', 'bunny', 'napkin'] 2022-03-17 06:25:45,020.020 2829:trainer.py:487 do_train_dict(): eta: 6:29:36 iter: 53000 speed: 274.4 images/sec total_norm: 147.5545 (150.3079) loss: 139.4126 (139.8593) masked_loss: 1.4222 (1.4393) tag_loss: 138.0947 (138.4200) time: 1.4325 (1.8661) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4274 (1.8610) save_time: 8.8421 (15.3432) lr: 0.000020 max mem: 26307 2022-03-17 06:25:45,381.381 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7575757503509521 2022-03-17 06:25:45,381.381 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 164.57278442382812 2022-03-17 06:25:45,381.381 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.52162080610314 2022-03-17 06:26:11,898.898 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02310321480035782 2022-03-17 06:26:11,899.899 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 06:26:11,899.899 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'group', 'of', 'gi', '##raf', '##fe', '[MASK]', '[MASK]', 'on', 'a', 'green', 'field', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 06:26:11,914.914 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'grass', 'sky', '[UNK]', 'leg', 'log', 'neck', 'shadow', 'bush', 'zoo', 'head', 'ground', 'trunk', 'enclosure', 'rock', 'tail', 'pole', 'fence', 'field', 'group', 'branch', 'animal', 'stump', 'wood', 'green', 'area', 'dirt', 'park', 'wall', 'post', 'grassy', 'lush', 'next', 'herd', 'flower', 'water', 'hair', 'mane', 'zebra', 'other', 'large', 'sunny', 'person', 'plant', 'couple', 'hay', 'horn', 'basket', 'open', 'spot'] 2022-03-17 06:26:27,799.799 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'group', 'field', 'ground', 'rock', 'green', 'neck', 'tree', 'wood', 'sky', 'leg', 'shadow', 'grass', 'tail', 'bush', 'trunk', 'log', 'zoo', 'enclosure', 'stump'] 2022-03-17 06:28:51,661.661 2829:trainer.py:487 do_train_dict(): eta: 6:26:44 iter: 53100 speed: 274.3 images/sec total_norm: 149.5856 (152.1965) loss: 135.7643 (138.6308) masked_loss: 1.4620 (1.4428) tag_loss: 134.8812 (137.1880) time: 1.4317 (1.8664) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4268 (1.8612) save_time: 8.8421 (15.3432) lr: 0.000020 max mem: 26307 2022-03-17 06:28:52,022.022 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6764705777168274 2022-03-17 06:28:52,023.023 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 159.529052734375 2022-03-17 06:28:52,023.023 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.51662395592022 2022-03-17 06:29:18,827.827 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023088660091161728 2022-03-17 06:29:18,827.827 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 06:29:18,828.828 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', '[MASK]', 'teddy', 'bears', 'are', 'sitting', 'together', 'on', 'the', 'red', 'table', '##cloth', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 06:29:18,843.843 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['foot', 'bear', 'teddy', 'head', 'nose', 'arm', 'eye', 'ear', 'paw', 'leg', 'stuffed', 'table', 'face', '[UNK]', 'towel', 'shirt', 'box', 'floor', 'wall', 'bench', 'toy', 'book', 'ground', 'mat', 'chair', 'toe', 'animal', 'doll', 'ball', 'cloth', 'reflection', 'bow', 'paper', 'ribbon', 'hand', 'bag', 'stripe', 'pillow', 'sock', 'hat', 'napkin', 'tree', 'tie', 'plant', 'container', 'shoe', 'hair', 'cushion', 'white', 'man'] 2022-03-17 06:29:34,882.882 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'face', 'red', 'rock', 'seat', 'arm', 'eye', 'neck', 'foot', 'window', 'tree', 'shirt', 'leg', 'clothes', 'nose', 'ear', 'bear', 'grass', 'bush', 'bench', 'toy', 'pillow', 'towel', 'ribbon', 'teddy', 'scarf', 'paw'] 2022-03-17 06:31:58,707.707 2829:trainer.py:487 do_train_dict(): eta: 6:23:53 iter: 53200 speed: 273.7 images/sec total_norm: 149.3245 (152.8166) loss: 141.7967 (143.0419) masked_loss: 1.3289 (1.3933) tag_loss: 140.7359 (141.6486) time: 1.4337 (1.8704) data: 0.0001 (0.0005) to_device: 0.0050 (0.0049) time_gpu: 1.4284 (1.8651) save_time: 8.8421 (15.3432) lr: 0.000020 max mem: 26307 2022-03-17 06:31:59,067.067 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663 2022-03-17 06:31:59,068.068 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 162.45693969726562 2022-03-17 06:31:59,068.068 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.51993309579244 2022-03-17 06:32:25,947.947 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023116614669561386 2022-03-17 06:32:25,948.948 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 06:32:25,948.948 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'black', '[MASK]', '[MASK]', 'outside', 'a', 'wooden', 'door', 'on', 'bricks', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 06:32:25,963.963 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wheel', 'tire', 'motorcycle', 'wall', 'seat', 'brick', 'engine', 'ground', '[UNK]', 'building', 'floor', 'bike', 'tank', 'tile', 'fender', 'sidewalk', 'light', 'pipe', 'gas', 'door', 'logo', 'exhaust', 'spoke', 'handle', 'plate', 'mirror', 'window', 'plant', 'next', 'sign', 'chain', 'motor', 'front', 'black', 'license', 'weed', 'stand', 'red', 'curtain', 'fence', 'stone', 'pedal', 'pot', 'curb', 'tail', 'shadow', 'garage', 'rim', 'side', 'stain'] 2022-03-17 06:32:41,877.877 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'building', 'door', 'light', 'ground', 'rock', 'wall', 'seat', 'stone', 'engine', 'window', 'gas', 'wooden', 'tank', 'plate', 'wheel', 'mirror', 'brick', 'tail', 'pole', 'pipe', 'motorcycle', 'sidewalk', 'tire', 'mat', 'fender'] 2022-03-17 06:35:05,450.450 2829:trainer.py:487 do_train_dict(): eta: 6:21:02 iter: 53300 speed: 274.2 images/sec total_norm: 148.2015 (151.5175) loss: 138.6734 (139.3751) masked_loss: 1.4129 (1.4551) tag_loss: 137.6418 (137.9200) time: 1.4328 (1.8674) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4276 (1.8624) save_time: 8.8421 (15.3432) lr: 0.000020 max mem: 26307 2022-03-17 06:35:05,812.812 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6111111044883728 2022-03-17 06:35:05,813.813 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 164.28720092773438 2022-03-17 06:35:05,813.813 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.51550155096733 2022-03-17 06:35:32,375.375 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023114699870347977 2022-03-17 06:35:32,376.376 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 06:35:32,376.376 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'woman', 'on', '[MASK]', 'pair', 'of', 'skies', 'during', '[MASK]', 'competition', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 06:35:32,391.391 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['boot', 'ski', 'snow', 'pole', '[UNK]', 'glove', 'leg', 'hair', 'ground', 'woman', 'hand', 'head', 'number', 'shirt', 'skier', 'vest', 'face', 'helmet', 'person', 'foot', 'arm', 'suit', 'logo', 'snowy', 'slope', 'girl', 'hat', 'hill', 'jacket', 'tree', 'outfit', 'red', 'sleeve', 'sky', 'letter', 'shin', 'top', 'line', 'man', 'stick', 'ponytail', 'flag', 'sign', 'track', 'skiing', 'pants', 'competitive', 'course', 'guard', 'cap'] 2022-03-17 06:35:48,360.360 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'hand', 'number', 'face', 'woman', 'ground', 'hair', 'competition', 'foot', 'guard', 'shirt', 'pair', 'leg', 'snow', 'pole', 'jacket', 'logo', 'ski', 'boot', 'helmet', 'shin', 'glove', 'vest', 'skier'] 2022-03-17 06:38:12,447.447 2829:trainer.py:487 do_train_dict(): eta: 6:18:11 iter: 53400 speed: 273.8 images/sec total_norm: 149.7664 (153.5950) loss: 138.5418 (139.5406) masked_loss: 1.3497 (1.3930) tag_loss: 137.0514 (138.1477) time: 1.4332 (1.8700) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4279 (1.8649) save_time: 8.8421 (15.3432) lr: 0.000020 max mem: 26307 2022-03-17 06:38:12,808.808 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.44117647409439087 2022-03-17 06:38:12,808.808 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 151.84719848632812 2022-03-17 06:38:12,808.808 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.51551540053893 2022-03-17 06:38:39,758.758 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023107659071683884 2022-03-17 06:38:39,758.758 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 06:38:39,758.758 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'holding', 'a', '[MASK]', 'bat', '[MASK]', 'children', 'and', 'other', 'adults', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 06:38:39,774.774 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hair', 'jean', 'head', 'boy', 'shirt', 'wall', '[UNK]', 'vest', 'cabinet', 'kitchen', 'jacket', 'bat', 'hand', 'couch', 'table', 'man', 'woman', 'light', 'child', 'chair', 'ear', 'hat', 'girl', 'lamp', 'person', 'handle', 'balloon', 'baseball', 'dress', 'ceiling', 'cup', 'hood', 'door', 'vase', 'shelf', 'face', 'sleeve', 'sink', 'rack', 'towel', 'pillow', 'cloth', 'flower', 'kid', 'bag', 'paper', 'can', 'floor', 'microwave', 'plant'] 2022-03-17 06:38:55,650.650 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'other', 'can', 'head', 'man', 'hand', 'door', 'light', 'woman', 'hair', 'girl', 'person', 'table', 'wall', 'boy', 'paper', 'jean', 'shirt', 'kitchen', 'dress', 'handle', 'cabinet', 'hat', 'couch', 'jacket', 'bat', 'hood', 'towel', 'lamp', 'rack', 'fixture', 'vest', 'foam', 'leash'] 2022-03-17 06:41:19,659.659 2829:trainer.py:487 do_train_dict(): eta: 6:15:20 iter: 53500 speed: 273.5 images/sec total_norm: 147.9064 (149.4840) loss: 135.6014 (137.6662) masked_loss: 1.3493 (1.4345) tag_loss: 133.8281 (136.2318) time: 1.4336 (1.8721) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4283 (1.8669) save_time: 8.8421 (15.3432) lr: 0.000019 max mem: 26307 2022-03-17 06:41:20,022.022 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.375 2022-03-17 06:41:20,022.022 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 135.85379028320312 2022-03-17 06:41:20,022.022 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.51670611794316 2022-03-17 06:41:47,055.055 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023115666583180428 2022-03-17 06:41:47,055.055 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 06:41:47,055.055 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'sidewalk', 'outside', 'muscles', '[MASK]', 'winery', 'with', 'tables', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 06:41:47,071.071 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['window', 'building', 'sign', 'sidewalk', 'wall', 'sky', 'pole', 'car', 'table', 'chair', 'light', 'tree', 'street', 'shadow', '[UNK]', 'door', 'brick', 'curb', 'glass', 'reflection', 'store', 'city', 'restaurant', 'ground', 'bench', 'mat', 'bike', 'post', 'night', 'person', 'line', 'basket', 'bicycle', 'letter', 'base', 'empty', 'side', 'front', 'dirt', 'can', 'tile', 'meter', 'paper', 'parking', 'road', 'pipe', 'trash', 'large', 'patio', 'man'] 2022-03-17 06:42:03,008.008 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['name', 'building', 'door', 'light', 'car', 'wall', 'chair', 'window', 'tree', 'sign', 'sky', 'pole', 'sidewalk', 'winery'] 03-17 06:43:20.405 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-17 06:43:20.405 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-17 06:43:21.369 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-17 06:44:26,711.711 2829:trainer.py:487 do_train_dict(): eta: 6:12:28 iter: 53600 speed: 273.7 images/sec total_norm: 148.2527 (151.9876) loss: 139.1617 (140.3812) masked_loss: 1.3356 (1.4347) tag_loss: 137.4768 (138.9464) time: 1.4322 (1.8705) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4269 (1.8654) save_time: 8.8421 (15.3432) lr: 0.000019 max mem: 26307 2022-03-17 06:44:27,070.070 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361 2022-03-17 06:44:27,071.071 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 130.1796875 2022-03-17 06:44:27,071.071 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.51425576520809 2022-03-17 06:44:54,166.166 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023094050586223602 2022-03-17 06:44:54,166.166 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 06:44:54,166.166 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'large', 'bohemia', 'is', 'attending', 'a', '93', 'game', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 06:44:54,182.182 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['person', 'stadium', 'shirt', 'stand', '[UNK]', 'man', 'line', 'player', 'wall', 'hat', 'tennis', 'court', 'game', 'sky', 'net', 'sign', 'shoe', 'head', 'field', 'cap', 'grass', 'chair', 'uniform', 'short', 'umpire', 'crowd', 'spectator', 'advertisement', 'fence', 'hair', 'match', 'woman', 'bag', 'logo', 'camera', 'pole', 'leg', 'shadow', 'building', 'arm', 'ball', 'cooler', 'stair', 'banner', 'catcher', 'outfit', 'baseball', 'roof', 'billboard', 'screen'] 2022-03-17 06:45:10,179.179 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'game', 'line', 'large', 'player', 'court', 'short', 'field', 'hair', 'person', 'wall', 'stand', 'chair', 'stadium', 'sky', 'shirt', 'audience', 'roof', 'tennis', 'net', 'hat', 'shoe'] 2022-03-17 06:47:33,852.852 2829:trainer.py:487 do_train_dict(): eta: 6:09:37 iter: 53700 speed: 273.6 images/sec total_norm: 148.8559 (150.8806) loss: 138.2961 (137.3568) masked_loss: 1.4703 (1.4358) tag_loss: 137.1160 (135.9210) time: 1.4334 (1.8714) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4282 (1.8662) save_time: 8.8421 (15.3432) lr: 0.000019 max mem: 26307 2022-03-17 06:47:34,215.215 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6875 2022-03-17 06:47:34,216.216 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 114.52116394042969 2022-03-17 06:47:34,216.216 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.52387236839776 2022-03-17 06:48:01,303.303 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023098932579159737 2022-03-17 06:48:01,304.304 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 06:48:01,304.304 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'this', 'is', 'someone', '##s', 'office', 'inside', '[MASK]', 'their', 'home', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 06:48:01,320.320 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['window', 'wall', 'shelf', 'computer', 'table', 'desk', 'monitor', 'paper', 'book', 'glass', 'mouse', '[UNK]', 'keyboard', 'laptop', 'logo', 'room', 'cord', 'box', 'speaker', 'picture', 'lamp', 'pen', 'painting', 'bottle', 'wire', 'office', 'phone', 'pad', 'screen', 'cd', 'sign', 'door', 'cup', 'can', 'water', 'apple', 'coffee', 'chair', 'light', 'frame', 'stand', 'glasses', 'printer', 'floor', 'reflection', 'bowl', 'coaster', 'handle', 'curtain', 'cabinet'] 2022-03-17 06:48:17,312.312 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'home', 'room', 'book', 'office', 'table', 'wall', 'glass', 'paper', 'computer', 'window', 'box', 'picture', 'painting', 'desk', 'cabinet', 'speaker', 'liquid', 'pen', 'wire', 'mouse', 'monitor', 'logo', 'keyboard', 'lamp', 'shelf', 'cord', 'pad', 'laptop', 'printer', 'coaster', 'vase'] 2022-03-17 06:50:41,047.047 2829:trainer.py:487 do_train_dict(): eta: 6:06:45 iter: 53800 speed: 273.5 images/sec total_norm: 148.5597 (150.7829) loss: 137.6516 (138.9184) masked_loss: 1.4509 (1.4525) tag_loss: 136.3108 (137.4659) time: 1.4347 (1.8719) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4294 (1.8668) save_time: 8.8421 (15.3432) lr: 0.000019 max mem: 26307 2022-03-17 06:50:41,408.408 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6176470518112183 2022-03-17 06:50:41,409.409 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 156.27928161621094 2022-03-17 06:50:41,409.409 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.51999815054417 2022-03-17 06:51:08,269.269 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023086512461304665 2022-03-17 06:51:08,269.269 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 06:51:08,269.269 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'white', 'plate', 'with', 'the', 'remains', 'of', 'cake', 'and', 'ice', 'cream', 'on', 'it', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 06:51:08,285.285 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['table', 'plate', 'cake', '[UNK]', 'bowl', 'bread', 'food', 'shadow', 'cream', 'napkin', 'dessert', 'ice', 'chocolate', 'fork', 'meat', 'white', 'crust', 'spoon', 'handle', 'sauce', 'piece', 'paper', 'sandwich', 'glass', 'top', 'eaten', 'knife', 'half', 'slice', 'pastry', 'cup', 'coffee', 'water', 'whipped', 'dish', 'topping', 'desert', 'container', 'design', 'cloth', 'sugar', 'close', 'next', 'object', 'light', 'pie', 'cut', 'small', 'side', 'layer'] 2022-03-17 06:51:24,178.178 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'white', 'cup', 'table', 'food', 'ice', 'bowl', 'handle', 'plate', 'shadow', 'cream', 'bread', 'fork', 'cake', 'sauce'] 2022-03-17 06:53:48,199.199 2829:trainer.py:487 do_train_dict(): eta: 6:03:54 iter: 53900 speed: 273.6 images/sec total_norm: 148.3664 (150.3349) loss: 135.0100 (135.2947) masked_loss: 1.3557 (1.3908) tag_loss: 134.3097 (133.9039) time: 1.4326 (1.8716) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4273 (1.8664) save_time: 8.8421 (15.3432) lr: 0.000019 max mem: 26307 2022-03-17 06:53:48,561.561 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5714285969734192 2022-03-17 06:53:48,561.561 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 132.25933837890625 2022-03-17 06:53:48,561.561 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.5328981116966 2022-03-17 06:54:15,532.532 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023074764758348465 2022-03-17 06:54:15,532.532 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 06:54:15,532.532 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'of', 'people', 'sand', '##ing', 'around', '[MASK]', 'kitchen', 'having', 'conversation', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 06:54:15,548.548 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hair', 'shirt', '[UNK]', 'kitchen', 'window', 'cabinet', 'woman', 'floor', 'man', 'hand', 'sweater', 'watch', 'jean', 'person', 'girl', 'door', 'apple', 'shelf', 'bowl', 'stove', 'head', 'basket', 'pot', 'bottle', 'wall', 'refrigerator', 'oven', 'bag', 'plate', 'food', 'glasses', 'fruit', 'arm', 'can', 'table', 'bracelet', 'shoe', 'drawer', 'microwave', 'box', 'light', 'picture', 'face', 'sink', 'chair', 'rack', 'towel', 'handle', 'lady', 'cup'] 2022-03-17 06:54:31,499.499 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'group', 'hand', 'door', 'light', 'woman', 'hair', 'girl', 'person', 'floor', 'wall', 'food', 'arm', 'lady', 'window', 'watch', 'box', 'shirt', 'kitchen', 'picture', 'dress', 'conversation', 'bag', 'bowl', 'cabinet', 'fan', 'ceiling', 'apple', 'glasses', 'pot', 'boot', 'basket', 'shelf', 'container', 'necklace', 'drawer', 'sweater', 'banana', 'oven', 'refrigerator', 'microwave', 'bracelet'] 2022-03-17 06:56:55,586.586 2829:trainer.py:487 do_train_dict(): eta: 6:01:02 iter: 54000 speed: 273.2 images/sec total_norm: 148.0509 (151.0156) loss: 138.7195 (139.3287) masked_loss: 1.3462 (1.3543) tag_loss: 137.3788 (137.9744) time: 1.4326 (1.8739) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4273 (1.8687) save_time: 8.8421 (15.3432) lr: 0.000019 max mem: 26307 2022-03-17 06:56:55,946.946 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7142857313156128 2022-03-17 06:56:55,947.947 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 156.38815307617188 2022-03-17 06:56:55,947.947 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.52628975831206 2022-03-17 06:57:22,803.803 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023100974038243294 2022-03-17 06:57:22,803.803 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 06:57:22,803.803 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', '[MASK]', 'hanging', 'over', 'a', 'city', 'street', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 06:57:22,819.819 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'pole', 'light', 'building', 'tree', 'street', 'car', 'road', 'sign', 'window', 'sidewalk', 'traffic', 'line', 'curb', '[UNK]', 'city', 'person', 'arrow', 'intersection', 'shadow', 'roof', 'post', 'fire', 'store', 'tire', 'corner', 'signal', 'bush', 'lamp', 'green', 'red', 'truck', 'suv', 'bus', 'man', 'flag', 'tail', 'van', 'house', 'box', 'median', 'busy', 'door', 'fence', 'clock', 'wall', 'chimney', 'jacket', 'cover', 'town'] 2022-03-17 06:57:38,710.710 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'city', 'line', 'building', 'road', 'street', 'light', 'woman', 'car', 'post', 'person', 'window', 'tree', 'box', 'store', 'sign', 'sky', 'jean', 'traffic', 'pole', 'jacket', 'globe', 'lamp', 'sidewalk', 'curb'] 2022-03-17 07:00:02,819.819 2829:trainer.py:487 do_train_dict(): eta: 5:58:11 iter: 54100 speed: 273.5 images/sec total_norm: 148.3124 (148.5386) loss: 141.6317 (142.7959) masked_loss: 1.4132 (1.4406) tag_loss: 140.1342 (141.3553) time: 1.4324 (1.8723) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4270 (1.8670) save_time: 8.8421 (15.3432) lr: 0.000019 max mem: 26307 2022-03-17 07:00:03,184.184 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4285714328289032 2022-03-17 07:00:03,184.184 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 187.35125732421875 2022-03-17 07:00:03,185.185 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.51709695935689 2022-03-17 07:00:30,438.438 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023073064163327217 2022-03-17 07:00:30,438.438 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 07:00:30,439.439 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'navy', '[MASK]', 'is', 'moore', '##d', 'in', 'front', 'of', 'woods', '[MASK]', 'a', 'clock', 'tower', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 07:00:30,454.454 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'water', 'boat', 'sky', 'building', 'bridge', 'tower', 'harbor', 'background', '[UNK]', 'tire', 'person', 'dock', 'flag', 'forest', 'pole', 'window', 'reflection', 'large', 'structure', 'roof', 'sign', 'car', 'stripe', 'mast', 'life', 'river', 'bird', 'palm', 'box', 'man', 'number', 'crane', 'shore', 'light', 'house', 'next', 'bottom', 'body', 'cabin', 'small', 'writing', 'pier', 'wall', 'post', 'wheel', 'other', 'ball', 'line', 'lamp'] 2022-03-17 07:00:46,495.495 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'number', 'water', 'building', 'front', 'bridge', 'navy', 'tree', 'tower', 'letter', 'sign', 'sky', 'bottom', 'boat', 'background', 'roof', 'clock', 'flag', 'vessel', 'pole', 'cabin', 'dock', 'tire', 'stair'] 2022-03-17 07:03:10,141.141 2829:trainer.py:487 do_train_dict(): eta: 5:55:19 iter: 54200 speed: 273.3 images/sec total_norm: 149.4554 (151.9599) loss: 133.5153 (135.4099) masked_loss: 1.4301 (1.4873) tag_loss: 132.0351 (133.9226) time: 1.4322 (1.8732) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4272 (1.8681) save_time: 8.8421 (15.3432) lr: 0.000018 max mem: 26307 2022-03-17 07:03:10,500.500 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5625 2022-03-17 07:03:10,500.500 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 137.13644409179688 2022-03-17 07:03:10,500.500 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.52164006101492 2022-03-17 07:03:37,476.476 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023063212633132935 2022-03-17 07:03:37,476.476 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 07:03:37,477.477 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'couple', 'of', 'people', 'skiing', '[MASK]', 'a', 'snowy', 'slope', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 07:03:37,492.492 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['ski', 'pole', '[UNK]', 'tree', 'person', 'snow', 'track', 'ground', 'hair', 'jacket', 'leg', 'skier', 'woman', 'backpack', 'branch', 'country', 'cross', 'boot', 'hand', 'hat', 'foot', 'trail', 'hill', 'head', 'snowy', 'sign', 'slope', 'boy', 'arm', 'girl', 'poles', 'coat', 'trunk', 'shirt', 'glove', 'path', 'child', 'sky', 'pine', 'skiing', 'man', 'bush', 'couple', 'side', 'line', 'wood', 'hood', 'way', 'wooded', 'shoe'] 2022-03-17 07:03:53,376.376 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'ground', 'hair', 'track', 'person', 'child', 'couple', 'tree', 'sign', 'trail', 'snow', 'coat', 'pole', 'jacket', 'ski', 'slope', 'poles', 'backpack', 'snowy', 'skier'] 2022-03-17 07:06:17,399.399 2829:trainer.py:487 do_train_dict(): eta: 5:52:28 iter: 54300 speed: 273.4 images/sec total_norm: 148.4166 (151.8802) loss: 136.9527 (139.2305) masked_loss: 1.4078 (1.4522) tag_loss: 135.5449 (137.7783) time: 1.4327 (1.8726) data: 0.0001 (0.0005) to_device: 0.0051 (0.0050) time_gpu: 1.4275 (1.8671) save_time: 8.8421 (15.3432) lr: 0.000018 max mem: 26307 2022-03-17 07:06:17,759.759 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6060606241226196 2022-03-17 07:06:17,759.759 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 149.07948303222656 2022-03-17 07:06:17,760.760 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.5228221206104 2022-03-17 07:06:44,990.990 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023059379309415817 2022-03-17 07:06:44,991.991 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 07:06:44,991.991 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'in', 'a', 'vintage', 'banker', 'suit', 'is', 'leaning', 'against', 'a', 'vehicle', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 07:06:45,007.007 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'button', 'man', 'sleeve', 'vest', 'nose', 'tie', 'head', 'arm', 'face', 'belt', '[UNK]', 'collar', 'mouth', 'hair', 'ear', 'eye', 'suit', 'buckle', 'jacket', 'pocket', 'wall', 'hand', 'window', 'glasses', 'watch', 'phone', 'hat', 'name', 'sunglasses', 'neck', 'knot', 'tag', 'person', 'wrist', 'building', 'background', 'car', 'coat', 'beard', 'sky', 'shadow', 'cell', 'logo', 'black', 'tree', 'curtain', 'cuff', 'grass', 'chin'] 2022-03-17 07:07:00,906.906 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'name', 'hand', 'face', 'hair', 'wall', 'arm', 'neck', 'window', 'shirt', 'label', 'picture', 'vehicle', 'nose', 'suit', 'frame', 'handle', 'tie', 'belt', 'blind', 'tag', 'button', 'jacket', 'sleeve', 'banker', 'vintage', 'sunglasses', 'vest', 'mustache'] 2022-03-17 07:09:24,628.628 2829:trainer.py:487 do_train_dict(): eta: 5:49:36 iter: 54400 speed: 273.5 images/sec total_norm: 148.6420 (150.2953) loss: 137.3252 (139.1413) masked_loss: 1.4169 (1.4307) tag_loss: 135.7573 (137.7105) time: 1.4322 (1.8723) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4270 (1.8671) save_time: 8.8421 (15.3432) lr: 0.000018 max mem: 26307 2022-03-17 07:09:24,988.988 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.65625 2022-03-17 07:09:24,988.988 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 140.4403076171875 2022-03-17 07:09:24,988.988 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.52352770184159 2022-03-17 07:09:52,068.068 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02305726893246174 2022-03-17 07:09:52,069.069 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 07:09:52,069.069 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'glass', 'of', '[MASK]', 'and', 'white', 'wine', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 07:09:52,084.084 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['table', 'glass', 'stem', 'wine', 'base', '[UNK]', 'light', 'liquid', 'red', 'bottle', 'wall', 'handle', 'object', 'bowl', 'rim', 'background', 'close', 'next', 'shadow', 'reflection', 'white', 'paper', 'top', 'cup', 'plate', 'counter', 'knife', 'wooden', 'water', 'empty', 'ring', 'bubble', 'flower', 'person', 'small', 'bunch', 'floor', 'surface', 'label', 'full', 'fork', 'food', 'glasses', 'knot', 'napkin', 'view', 'sit', 'blade', 'image', 'other'] 2022-03-17 07:10:07,953.953 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['white', 'red', 'cup', 'table', 'glass', 'wine', 'bowl', 'liquid', 'stem'] 2022-03-17 07:12:32,327.327 2829:trainer.py:487 do_train_dict(): eta: 5:46:44 iter: 54500 speed: 272.8 images/sec total_norm: 148.6849 (151.0165) loss: 137.1681 (138.1163) masked_loss: 1.4510 (1.4560) tag_loss: 135.2661 (136.6602) time: 1.4332 (1.8770) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4281 (1.8718) save_time: 8.8421 (15.3432) lr: 0.000018 max mem: 26307 2022-03-17 07:12:32,688.688 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6363636255264282 2022-03-17 07:12:32,688.688 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 163.749267578125 2022-03-17 07:12:32,688.688 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.52617822025285 2022-03-17 07:13:00,024.024 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023123163729906082 2022-03-17 07:13:00,024.024 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 07:13:00,025.025 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'holding', 'up', 'holding', 'an', 'object', 'inside', 'of', 'a', 'plastic', '[MASK]', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 07:13:00,040.040 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['man', 'hand', 'shirt', 'head', 'beard', 'nose', 'sign', 'necklace', 'hair', 'wall', 'door', 'building', 'face', 'eye', 'arm', 'mustache', 'neck', 'fence', 'phone', 'chain', 'box', 'window', 'sky', 'letter', 'floor', 'cell', 'ground', 'ear', '[UNK]', 'jean', 'mouth', 'bracelet', 'chair', 'front', 'handle', 'wrist', 'tree', 'facial', 'house', 'post', 'camera', 'collar', 'watch', 'next', 'sleeve', 'picture', 'roof', 'frame', 'table', 'pole'] 2022-03-17 07:13:16,055.055 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'can', 'head', 'man', 'hand', 'face', 'building', 'door', 'short', 'inside', 'case', 'ground', 'hair', 'mouth', 'floor', 'wall', 'arm', 'phone', 'eye', 'chair', 'neck', 'box', 'letter', 'sign', 'sky', 'shirt', 'nose', 'object', 'plastic', 'fence', 'beard', 'necklace', 'stool', 'mustache'] 03-17 07:13:21.469 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-17 07:13:21.469 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-17 07:13:22.537 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-17 07:15:39,560.560 2829:trainer.py:487 do_train_dict(): eta: 5:43:52 iter: 54600 speed: 273.5 images/sec total_norm: 150.4979 (153.6734) loss: 138.9823 (139.6637) masked_loss: 1.3454 (1.3863) tag_loss: 136.8902 (138.2775) time: 1.4325 (1.8724) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4274 (1.8671) save_time: 8.8421 (15.3432) lr: 0.000018 max mem: 26307 2022-03-17 07:15:39,921.921 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091 2022-03-17 07:15:39,921.921 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 158.3876953125 2022-03-17 07:15:39,922.922 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.53041911430289 2022-03-17 07:16:07,467.467 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02310135029256344 2022-03-17 07:16:07,468.468 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 07:16:07,468.468 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'passenger', 'train', '[MASK]', 'on', 'train', 'tracks', '##ccus', 'overhead', 'wires', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 07:16:07,484.484 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['train', 'window', 'track', 'door', 'sky', 'number', 'wire', 'crane', 'wheel', 'building', 'flag', 'light', 'top', 'roof', 'car', 'structure', 'tower', '[UNK]', 'vent', 'sign', 'black', 'logo', 'clock', 'pole', 'silver', 'line', 'cable', 'metal', 'railroad', 'church', 'platform', 'gravel', 'bottom', 'bridge', 'letter', 'old', 'front', 'chimney', 'power', 'person', 'passenger', 'rail', 'antenna', 'man', 'station', 'wall', 'wood', 'step', 'board', 'blue'] 2022-03-17 07:16:23,469.469 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['number', 'building', 'top', 'door', 'track', 'window', 'train', 'letter', 'sky', 'bottom', 'passenger', 'clock', 'flag', 'wheel', 'wire', 'logo', 'overhead', 'ladder', 'crane'] 2022-03-17 07:18:47,167.167 2829:trainer.py:487 do_train_dict(): eta: 5:41:00 iter: 54700 speed: 272.9 images/sec total_norm: 150.6016 (153.3920) loss: 139.6840 (141.2588) masked_loss: 1.4077 (1.4399) tag_loss: 138.3964 (139.8189) time: 1.4327 (1.8761) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4273 (1.8708) save_time: 8.8421 (15.3432) lr: 0.000018 max mem: 26307 2022-03-17 07:18:47,528.528 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6857143044471741 2022-03-17 07:18:47,529.529 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 143.51622009277344 2022-03-17 07:18:47,529.529 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.53192005714361
2022-03-17 07:19:15,056.056 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02309594489634037
2022-03-17 07:19:15,057.057 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 07:19:15,057.057 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'bears', 'climbing', '[MASK]', 'rocks', 'in', '[MASK]', 'snow', '[SEP]', '[PAD]', '[PAD]', ...]
2022-03-17 07:19:15,073.073 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['bear', 'ear', 'nose', 'head', 'ground', 'eye', 'rock', 'leg', 'mouth', 'paw', 'snow', 'brown', 'back', 'face', 'log', 'snout', 'fur', 'wood', 'claw', 'shadow', 'tree', 'post', 'water', 'polar', 'large', 'wall', 'foot', 'grass', 'enclosure', 'pole', 'trunk', 'zoo', 'fence', 'tongue', 'reflection', 'tail', 'neck', 'dirt', 'stone', 'other', 'couple', 'big', 'pen', '[UNK]', 'stick', 'knot', 'next', 'boulder', 'animal', 'furry']
2022-03-17 07:19:31,021.021 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'face', 'ground', 'rock', 'mouth', 'eye', 'nose', 'ear', 'bear', 'snow', 'log', 'moss', 'paw']
2022-03-17 07:21:54,770.770 2829:trainer.py:487 do_train_dict(): eta: 5:38:08 iter: 54800 speed: 272.9 images/sec total_norm: 148.0623 (151.0993) loss: 140.7924 (141.4211) masked_loss: 1.4195 (1.4604) tag_loss: 139.1667 (139.9607) time: 1.4329 (1.8760) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4278 (1.8708) save_time: 8.8421 (15.3432) lr: 0.000017 max mem: 26307
2022-03-17 07:21:55,132.132 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7142857313156128
2022-03-17 07:21:55,133.133 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 122.77622985839844
2022-03-17 07:21:55,133.133 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.54276910027954
2022-03-17 07:22:22,902.902 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023112192749977112
2022-03-17 07:22:22,903.903 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 07:22:22,903.903 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', '[MASK]', 'sink', 'mounted', 'to', 'the', 'side', 'of', 'a', 'white', 'wall', '.', '[SEP]', '[PAD]', '[PAD]', ...]
2022-03-17 07:22:22,919.919 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'sink', 'floor', 'hole', 'base', 'paint', 'bathroom', 'toilet', '[UNK]', 'leaf', 'bowl', 'shelf', 'pipe', 'ceiling', 'painting', 'ground', 'drain', 'plant', 'room', 'dirty', 'basin', 'tile', 'seat', 'window', 'tank', 'dirt', 'stand', 'trash', 'graffiti', 'broken', 'old', 'handle', 'building', 'white', 'line', 'flower', 'leg', 'tree', 'vase', 'lid', 'light', 'pedestal', 'small', 'table', 'tub', 'water', 'sculpture', 'soap', 'stain', 'reflection']
2022-03-17 07:22:38,823.823 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'side', 'white', 'floor', 'wall', 'base', 'bowl', 'hole', 'paint', 'sink', 'shelf', 'toilet']
2022-03-17 07:25:02,468.468 2829:trainer.py:487 do_train_dict(): eta: 5:35:16 iter: 54900 speed: 272.8 images/sec total_norm: 149.8817 (151.8378) loss: 137.2733 (137.8993) masked_loss: 1.3250 (1.3788) tag_loss: 135.8262 (136.5206) time: 1.4322 (1.8770) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4269 (1.8718) save_time: 8.8421 (15.3432) lr: 0.000017 max mem: 26307
2022-03-17 07:25:02,828.828 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5454545617103577
2022-03-17 07:25:02,829.829 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 124.87173461914062
2022-03-17 07:25:02,829.829 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.5487529893355
2022-03-17 07:25:30,442.442 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023135090246796608
2022-03-17 07:25:30,443.443 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 07:25:30,443.443 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'person', '[MASK]', 'surfing', 'in', 'a', '[MASK]', 'large', 'body', 'of', 'water', '.', '[SEP]', '[PAD]', '[PAD]', ...]
2022-03-17 07:25:30,459.459 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['water', 'sky', 'man', 'ocean', 'person', 'kite', '[UNK]', 'boat', 'wave', 'island', 'wake', 'sail', 'rock', 'board', 'hill', 'land', 'body', 'horizon', 'para', 'surfer', 'mountain', 'tree', 'wind', 'large', 'stripe', 'surf', 'surfing', 'day', 'blue', 'distance', 'building', 'head', 'cloud', 'ski', 'paddle', 'sunny', 'group', 'short', 'middle', 'string', 'line', 'bird', 'top', 'object', 'beach', 'leg', 'clear', 'sailing', 'shirt', 'bush']
2022-03-17 07:25:46,394.394 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'water', 'body', 'large', 'island', 'rock', 'person', 'sky', 'boat', 'ocean', 'wave', 'wake', 'horizon', 'kite']
2022-03-17 07:28:10,049.049 2829:trainer.py:487 do_train_dict(): eta: 5:32:24 iter: 55000 speed: 273.0 images/sec total_norm: 149.1632 (153.8445) loss: 140.8947 (140.2621) masked_loss: 1.3745 (1.4348) tag_loss: 139.3336 (138.8274) time: 1.4329 (1.8758) data: 0.0001 (0.0002) to_device: 0.0050 (0.0050) time_gpu: 1.4277 (1.8706) save_time: 8.8421 (15.3432) lr: 0.000017 max mem: 26307
2022-03-17 07:28:10,051.051 2829:checkpoint.py:222 save(): Saving checkpoint to output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0055000.pt
2022-03-17 07:28:19,121.121 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.800000011920929
2022-03-17 07:28:19,122.122 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 157.20748901367188
2022-03-17 07:28:19,122.122 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.55034812608778
2022-03-17 07:28:46,745.745 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02318420074880123
2022-03-17 07:28:46,745.745 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 07:28:46,746.746 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'toilet', 'and', 'a', 'sink', 'nazi', 'a', 'small', 'room', '.', '[MASK]', '[PAD]', '[PAD]', ...]
2022-03-17 07:28:46,761.761 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'floor', 'toilet', 'flower', 'bowl', 'seat', 'cord', 'base', 'design', 'basket', 'plate', 'wire', 'star', 'table', 'lid', '[UNK]', 'shoe', 'bathroom', 'pipe', 'vase', 'paper', 'chair', 'handle', 'hole', 'water', 'bucket', 'ground', 'hose', 'black', 'fireplace', 'decoration', 'bag', 'room', 'sink', 'rope', 'object', 'white', 'rim', 'scissors', 'brush', 'container', 'floral', 'holder', 'mirror', 'person', 'line', 'pot', 'boot', 'strap', 'cup']
2022-03-17 07:29:02,691.691 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['small', 'room', 'floor', 'star', 'table', 'wall', 'seat', 'base', 'bag', 'bowl', 'plate', 'bathroom', 'flower', 'wire', 'sink', 'pipe', 'boot', 'shoe', 'cord', 'toilet', 'bucket']
2022-03-17 07:31:25,594.594 2829:trainer.py:487 do_train_dict(): eta: 5:29:34 iter: 55100 speed: 261.8 images/sec total_norm: 149.2752 (152.3086) loss: 139.8544 (139.5519) masked_loss: 1.4109 (1.4445) tag_loss: 138.5486 (138.1074) time: 1.4328 (1.9554) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4277 (1.8633) save_time: 8.8421 (14.7395) lr: 0.000017 max mem: 26307
2022-03-17 07:31:25,958.958 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.78125
2022-03-17 07:31:25,958.958 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 146.60842895507812
2022-03-17 07:31:25,958.958 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.54972153124602
2022-03-17 07:31:53,708.708 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02319576032459736
2022-03-17 07:31:53,708.708 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 07:31:53,708.708 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'wool', '##ly', 'sheep', 'are', 'in', 'a', 'friedman', 'field', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', ...]
2022-03-17 07:31:53,724.724 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sheep', 'grass', 'fence', 'head', 'leg', 'field', 'post', 'ear', 'tree', 'pole', 'wool', 'face', '[UNK]', 'bush', 'green', 'eye', 'nose', 'tail', 'grazing', 'tag', 'building', 'person', 'trunk', 'dog', 'lamb', 'group', 'sky', 'animal', 'fur', 'grassy', 'background', 'herd', 'area', 'lush', 'mouth', 'pasture', 'large', 'standing', 'barn', 'body', 'hill', 'leaf', 'mane', 'white', 'next', 'wire', 'couple', 'coat', 'hay', 'wood']
2022-03-17 07:32:09,685.685 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'field', 'green', 'post', 'tree', 'leg', 'ear', 'grass', 'bush', 'fur', 'pole', 'trunk', 'sheep', 'fence', 'hay', 'wool', 'grazing']
2022-03-17 07:34:33,741.741 2829:trainer.py:487 do_train_dict(): eta: 5:26:42 iter: 55200 speed: 272.1 images/sec total_norm: 147.0503 (149.8581) loss: 136.7761 (138.6221) masked_loss: 1.3926 (1.4238) tag_loss: 135.2788 (137.1983) time: 1.4334 (1.8814) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4281 (1.8763) save_time: 8.8421 (14.7395) lr: 0.000017 max mem: 26307
2022-03-17 07:34:34,102.102 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.53125
2022-03-17 07:34:34,102.102 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 152.0072479248047
2022-03-17 07:34:34,102.102 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.55092213166773
2022-03-17 07:35:01,761.761 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023187007755041122
2022-03-17 07:35:01,762.762 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 07:35:01,762.762 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'blue', 'room', 'with', 'an', 'open', 'window', 'and', '[MASK]', 'bed', '.', '[SEP]', '[PAD]', '[PAD]', ...]
2022-03-17 07:35:01,778.778 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['bed', 'window', 'wall', 'bear', 'teddy', 'pillow', '[UNK]', 'bedroom', 'sign', 'room', 'sheet', 'head', 'blanket', 'frame', 'animal', 'ceiling', 'camera', 'curtain', 'light', 'shelf', 'lamp', 'stuffed', 'ear', 'picture', 'flower', 'speaker', 'bow', 'nose', 'knob', 'arm', 'mirror', 'table', 'leaf', 'blind', 'foot', 'clock', 'post', 'fan', 'box', 'white', 'tree', 'floor', 'blue', 'paper', 'rail', 'shirt', 'outlet', 'small', 'toy', 'large']
2022-03-17 07:35:17,732.732 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'room', 'open', 'blue', 'bed', 'wall', 'window', 'sign', 'bear', 'camera', 'ceiling', 'flower', 'sheet', 'pillow', 'curtain', 'shelf', 'teddy', 'paw']
2022-03-17 07:37:41,506.506 2829:trainer.py:487 do_train_dict(): eta: 5:23:50 iter: 55300 speed: 272.7 images/sec total_norm: 150.5369 (151.4214) loss: 140.9055 (141.8900) masked_loss: 1.4811 (1.4752) tag_loss: 139.4649 (140.4148) time: 1.4327 (1.8777) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4277 (1.8726) save_time: 8.8421 (14.7395) lr: 0.000017 max mem: 26307
2022-03-17 07:37:41,867.867 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7058823704719543
2022-03-17 07:37:41,867.867 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 118.02864074707031
2022-03-17 07:37:41,867.867 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.55853754174409
2022-03-17 07:38:09,565.565 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02322113700211048
2022-03-17 07:38:09,565.565 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 07:38:09,565.565 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'woman', 'tennis', 'player', 'stands', 'ready', 'to', 'receive', 'the', 'ball', '.', '[SEP]', '[PAD]', '[PAD]', ...]
2022-03-17 07:38:09,581.581 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['line', '[UNK]', 'tennis', 'court', 'hand', 'fence', 'woman', 'leg', 'shoe', 'hair', 'ground', 'handle', 'shirt', 'ball', 'head', 'tree', 'girl', 'skirt', 'bush', 'ponytail', 'bracelet', 'arm', 'pole', 'short', 'top', 'logo', 'face', 'player', 'wall', 'sock', 'person', 'young', 'net', 'hat', 'dress', 'grass', 'car', 'female', 'jacket', 'string', 'tank', 'wrist', 'stripe', 'lady', 'sign', 'trunk', 'necklace', 'mouth', 'roof', 'cap']
2022-03-17 07:38:25,556.556 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'hand', 'line', 'player', 'woman', 'court', 'short', 'ground', 'hair', 'ready', 'tree', 'ball', 'shirt', 'leg', 'handle', 'tennis', 'bush', 'pole', 'fence', 'shoe']
2022-03-17 07:40:49,691.691 2829:trainer.py:487 do_train_dict(): eta: 5:20:58 iter: 55400 speed: 272.1 images/sec total_norm: 148.6515 (150.2700) loss: 136.8090 (136.7787) masked_loss: 1.3725 (1.4105) tag_loss: 135.4121 (135.3681) time: 1.4333 (1.8818) data: 0.0001 (0.0005) to_device: 0.0051 (0.0050) time_gpu: 1.4280 (1.8762) save_time: 8.8421 (14.7395) lr: 0.000017 max mem: 26307
2022-03-17 07:40:50,053.053 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6486486196517944
2022-03-17 07:40:50,053.053 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 134.59735107421875
2022-03-17 07:40:50,054.054 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.55902457709784
2022-03-17 07:41:17,684.684 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02321443520486355
2022-03-17 07:41:17,684.684 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 07:41:17,684.684 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'group', 'of', 'people', 'standing', 'on', '[MASK]', 'beach', '[MASK]', 'to', 'an', 'orange', 'fence', '.', '[SEP]', '[PAD]', '[PAD]', ...]
2022-03-17 07:41:17,700.700 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['man', 'net', '[UNK]', 'vest', 'person', 'jacket', 'hill', 'shirt', 'snow', 'ground', 'foot', 'woman', 'shadow', 'short', 'ski', 'rock', 'hand', 'leg', 'bush', 'glove', 'head', 'hat', 'pole', 'helmet', 'bag', 'hair', 'group', 'fence', 'girl', 'arm', 'tree', 'grass', 'cap', 'sand', 'bottom', 'coat', 'boot', 'top', 'shoe', 'suit', 'couple', 'beach', 'boy', 'board', 'scarf', 'child', 'rope', 'pipe', 'sky', 'stick']
2022-03-17 07:41:33,622.622 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'life', 'group', 'hand', 'next', 'woman', 'short', 'ground', 'rock', 'board', 'person', 'arm', 'hill', 'couple', 'foot', 'beach', 'shirt', 'leg', 'bag', 'snow', 'orange', 'shadow', 'net', 'bush', 'hat', 'pole', 'jacket', 'ski', 'fence', 'helmet', 'shoe', 'glove', 'vest']
03-17 07:43:22.637 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi
03-17 07:43:22.637 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi
03-17 07:43:23.700 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}]
2022-03-17 07:43:57,697.697 2829:trainer.py:487 do_train_dict(): eta: 5:18:05 iter: 55500 speed: 272.3 images/sec total_norm: 149.0936 (151.6089) loss: 137.0255 (138.9479) masked_loss: 1.3884 (1.4175) tag_loss: 135.6696 (137.5304) time: 1.4344 (1.8801) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4292 (1.8749) save_time: 8.8421 (14.7395) lr: 0.000016 max mem: 26307
2022-03-17 07:43:58,057.057 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6969696879386902
2022-03-17 07:43:58,058.058 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 128.90350341796875
2022-03-17 07:43:58,058.058 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.56633572612735
2022-03-17 07:44:25,772.772 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023209838196635246
2022-03-17 07:44:25,772.772 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 07:44:25,773.773 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', 'young', 'kids', 'are', 'playing', '[MASK]', '##is', '[MASK]', 'together', 'outside', '.', '[SEP]', '[PAD]', '[PAD]', ...]
2022-03-17 07:44:25,788.788 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'hair', 'bush', '[UNK]', 'boy', 'tree', 'sidewalk', 'short', 'flower', 'ground', 'shadow', 'leg', 'jean', 'girl', 'arm', 'shoe', 'gravel', 'hand', 'head', 'roof', 'stick', 'foot', 'young', 'building', 'woman', 'park', 'sleeve', 'child', 'house', 'rock', 'branch', 'leaf', 'sky', 'design', 'little', 'face', 'flip', 'glasses', 'grass', 'car', 'person', 'plant', 'wheel', 'small', 'ball', 'backpack', 'bag', 'flop', 'ear', 'red']
2022-03-17 07:44:41,710.710 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'hand', 'park', 'young', 'short', 'car', 'ground', 'hair', 'girl', 'arm', 'boy', 'plant', 'tree', 'jean', 'shirt', 'leg', 'shadow', 'bush', 'flower', 'leaf', 'shoe', 'gravel', 'sidewalk', 'stripe']
2022-03-17 07:47:05,536.536 2829:trainer.py:487 do_train_dict(): eta: 5:15:13 iter: 55600 speed: 272.6 images/sec total_norm: 148.3173 (150.0740) loss: 134.8123 (137.3642) masked_loss: 1.4788 (1.4751) tag_loss: 133.3455 (135.8891) time: 1.4321 (1.8784) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4268 (1.8732) save_time: 8.8421 (14.7395) lr: 0.000016 max mem: 26307
2022-03-17 07:47:05,896.896 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6060606241226196
2022-03-17 07:47:05,897.897 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 122.99308013916016
2022-03-17 07:47:05,897.897 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.56851624329599
2022-03-17 07:47:33,805.805 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023229800164699554
2022-03-17 07:47:33,805.805 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 07:47:33,805.805 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'pile', 'of', 'fresh', 'fruits', 'and', 'vegetables', 'on', 'top', '[MASK]', 'a', 'counter', '.', '[SEP]', '[PAD]', '[PAD]', ...]
2022-03-17 07:47:33,821.821 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['apple', 'stem', '[UNK]', 'banana', 'squash', 'fruit', 'table', 'container', 'pumpkin', 'box', 'vegetable', 'leaf', 'onion', 'orange', 'pear', 'label', 'egg', 'potato', 'bunch', 'plastic', 'floor', 'shadow', 'bag', 'mushroom', 'top', 'other', 'logo', 'different', 'spot', 'green', 'lid', 'ground', 'crate', 'basket', 'bin', 'mango', 'bananas', 'counter', 'writing', 'plant', 'bottle', 'hole', 'next', 'letter', 'food', 'wall', 'variety', 'end', 'fresh', 'paper']
2022-03-17 07:47:49,652.652 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'top', 'table', 'label', 'counter', 'fresh', 'bottle', 'fruit', 'apple', 'package', 'stem', 'pile', 'container', 'banana', 'vegetable', 'mushroom', 'squash', 'onion', 'pumpkin', 'pear']
2022-03-17 07:50:13,270.270 2829:trainer.py:487 do_train_dict(): eta: 5:12:21 iter: 55700 speed: 272.7 images/sec total_norm: 149.8661 (152.7697) loss: 138.1550 (139.3135) masked_loss: 1.3633 (1.3793) tag_loss: 136.4007 (137.9342) time: 1.4328 (1.8773) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4277 (1.8721) save_time: 8.8421 (14.7395) lr: 0.000016 max mem: 26307
2022-03-17 07:50:13,629.629 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6764705777168274
2022-03-17 07:50:13,629.629 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 135.34750366210938
2022-03-17 07:50:13,630.630 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.56948369207348
2022-03-17 07:50:41,584.584 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023229990154504776
2022-03-17 07:50:41,585.585 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 07:50:41,585.585 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'men', 'that', '##heard', 'playing', 'a', 'wii', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', ...]
2022-03-17 07:50:41,601.601 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'man', 'wall', 'chair', 'glasses', 'hand', 'jean', 'ceiling', 'controller', 'table', 'couch', 'remote', 'head', 'face', 'game', 'room', 'arm', 'beard', 'door', 'strap', '[UNK]', 'light', 'hat', 'hair', 'short', 'lamp', 'ear', 'video', 'wii', 'boy', 'computer', 'doorway', 'monitor', 'cap', 'pillow', 'jersey', 'switch', 'stripe', 'cord', 'sofa', 'desk', 'speaker', 'living', 'can', 'fan', 'bottle', 'number', 'logo', 'television', 'person']
2022-03-17 07:50:57,526.526 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'game', 'face', 'room', 'short', 'hair', 'table', 'wall', 'arm', 'boy', 'chair', 'paper', 'plant', 'computer', 'jean', 'shirt', 'ear', 'frame', 'mirror', 'ceiling', 'couch', 'remote', 'doorway', 'glasses', 'monitor', 'blanket', 'beard', 'lamp', 'controller', 'strap']
2022-03-17 07:53:21,191.191 2829:trainer.py:487 do_train_dict(): eta: 5:09:28 iter: 55800 speed: 272.5 images/sec total_norm: 148.3135 (150.1460) loss: 136.5807 (137.9012) masked_loss: 1.4268 (1.4673) tag_loss: 135.1069 (136.4339) time: 1.4324 (1.8793) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4276 (1.8741) save_time: 8.8421 (14.7395) lr: 0.000016 max mem: 26307
2022-03-17 07:53:21,551.551 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091
2022-03-17 07:53:21,551.551 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 127.00611114501953
2022-03-17 07:53:21,551.551 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.58090278884806
2022-03-17 07:53:49,643.643 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02324078604578972
2022-03-17 07:53:49,643.643 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 07:53:49,643.643 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', 'people', 'are', 'over', 'by', 'the', 'cows', 'in', 'the', 'water', '[SEP]', '[PAD]', '[PAD]', ...]
2022-03-17 07:53:49,659.659 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['cow', 'water', 'grass', 'animal', 'head', 'boy', 'short', 'bull', 'hill', 'man', 'horn', 'river', 'person', 'hair', 'rock', 'elephant', 'shirt', 'tree', 'ear', 'collar', 'bank', '[UNK]', 'nose', 'dirt', 'trunk', 'ground', 'bird', 'bush', 'field', 'sky', 'rope', 'stick', 'herd', 'cattle', 'tail', 'wall', 'group', 'mouth', 'house', 'dog', 'face', 'building', 'buffalo', 'body', 'child', 'shore', 'neck', 'moss', 'hat', 'ripple']
2022-03-17 07:54:05,640.640 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'water', 'river', 'short', 'field', 'ground', 'rock', 'hair', 'person', 'boy', 'hill', 'neck', 'shirt', 'animal', 'nose', 'grass', 'bull', 'horn', 'collar', 'elephant', 'cow']
2022-03-17 07:56:29,346.346 2829:trainer.py:487 do_train_dict(): eta: 5:06:36 iter: 55900 speed: 272.1 images/sec total_norm: 148.4833 (151.3380) loss: 138.5231 (139.6853) masked_loss: 1.3421 (1.3937) tag_loss: 137.0009 (138.2916) time: 1.4329 (1.8816) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4277 (1.8763) save_time: 8.8421 (14.7395) lr: 0.000016 max mem: 26307
2022-03-17 07:56:29,706.706 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5277777910232544
2022-03-17 07:56:29,706.706 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 137.6077117919922
2022-03-17 07:56:29,706.706 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.58596892356873
2022-03-17 07:56:57,672.672 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02323666401207447
2022-03-17 07:56:57,672.672 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 07:56:57,673.673 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'little', 'boy', 'is', 'wearing', 'ski', '##s', 'and', '[MASK]', 'large', '[MASK]', 'inside', 'a', 'house', '.', '[SEP]', '[PAD]', '[PAD]', ...]
2022-03-17 07:56:57,688.688 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'helmet', 'hand', 'boy', 'sock', 'wall', 'floor', '[UNK]', 'boot', 'nose', 'face', 'strap', 'door', 'shadow', 'carpet', 'head', 'ski', 'leg', 'eye', 'rug', 'shoe', 'mouth', 'arm', 'child', 'mat', 'stripe', 'design', 'young', 'collar', 'sleeve', 'wheel', 'short', 'person', 'little', 'girl', 'pad', 'logo', 'word', 'board', 'jacket', 'chair', 'ground', 'knee', 'handle', 'ear', 'kid', 'pole', 'building', 'light', 'ball']
2022-03-17 07:57:13,738.738 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'house', 'hand', 'little', 'face', 'large', 'door', 'short', 'mouth', 'floor', 'child', 'wall', 'arm', 'boy', 'eye', 'shirt', 'nose', 'shadow', 'ski', 'boot', 'sleeve', 'helmet', 'mat', 'strap', 'stripe', 'sock']
2022-03-17 07:59:37,470.470 2829:trainer.py:487 do_train_dict(): eta: 5:03:43 iter: 56000 speed: 272.2 images/sec total_norm: 150.1007 (155.2089) loss: 137.1774 (137.8814) masked_loss: 1.4450 (1.4492) tag_loss: 135.9774 (136.4323) time: 1.4324 (1.8812) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4273 (1.8761) save_time: 8.8421 (14.7395) lr: 0.000016 max mem: 26307
2022-03-17 07:59:37,830.830 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6470588445663452
2022-03-17 07:59:37,831.831 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 124.09059143066406
2022-03-17 07:59:37,831.831 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.59260741997105
2022-03-17 08:00:05,924.924 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023231040686368942
2022-03-17 08:00:05,925.925 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 08:00:05,925.925 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'close', '-', 'up', 'of', 'heads', 'of', 'light', 'green', 'bro', '##cco', '##li', '.', '[MASK]', '[PAD]', '[PAD]', ...]
2022-03-17 08:00:05,940.940 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'stem', 'hole', 'cloth', 'background', 'vegetable', 'head', 'light', 'leaf', 'green', 'water', 'food', 'close', 'flower', 'object', 'piece', 'bud', 'plate', 'white', 'table', 'seed', 'reflection', 'bunch', 'image', 'plant', 'top', 'shadow', 'bowl', 'surface', 'other', 'large', 'name', 'view', 'field', 'next', 'small', 'full', 'back', 'picture', 'bean', 'photo', 'couple', 'napkin', 'line', 'dark', 'item', 'wall', 'end', 'design', 'fish']
2022-03-17 08:00:21,932.932 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'light', 'green', 'hole', 'stem', 'vegetable']
2022-03-17 08:02:45,766.766 2829:trainer.py:487 do_train_dict(): eta: 5:00:51 iter: 56100 speed: 271.9 images/sec total_norm: 151.2056 (154.6599) loss: 136.3632 (137.6634) masked_loss: 1.4114 (1.4704) tag_loss: 135.1435 (136.1930) time: 1.4322 (1.8830) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4270 (1.8778) save_time: 8.8421 (14.7395) lr: 0.000016 max mem: 26307
2022-03-17 08:02:46,128.128 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5714285969734192
2022-03-17 08:02:46,128.128 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 156.21681213378906
2022-03-17 08:02:46,129.129 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.59144246790332
2022-03-17 08:03:14,426.426 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023218417540192604
2022-03-17 08:03:14,426.426 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 08:03:14,427.427 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'couple', 'of', 'zebra', 'standing', 'next', 'to', 'each', 'other', 'on', 'a', 'field', '.', '[SEP]', '[PAD]', '[PAD]', ...]
2022-03-17 08:03:14,442.442 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['zebra', 'tree', 'sky', 'grass', 'leg', 'field', 'cloud', 'ear', 'tail', 'ground', 'mane', 'head', 'bush', 'dirt', 'fence', 'stripe', 'rock', 'animal', 'group', 'background', 'puddle', 'cow', '[UNK]', 'grassy', 'road', 'couple', 'herd', 'wood', 'other', 'patch', 'next', 'pole', 'area', 'stick', 'grazing', 'open', 'mud', 'green', 'wall', 'nose', 'path', 'sign', 'bird', 'person', 'car', 'line', 'building', 'water', 'horn', 'side']
2022-03-17 08:03:30,372.372 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['other', 'next', 'road', 'car', 'field', 'ground', 'date', 'couple', 'tree', 'wood', 'sky', 'leg', 'ear', 'grass', 'tail', 'bush', 'cloud', 'suv', 'mane', 'zebra']
2022-03-17 08:05:53,855.855 2829:trainer.py:487 do_train_dict(): eta: 4:57:58 iter: 56200 speed: 272.2 images/sec total_norm: 149.7986 (152.3104) loss: 137.5797 (138.6150) masked_loss: 1.3879 (1.4413) tag_loss: 135.9902 (137.1737) time: 1.4339 (1.8808) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4285 (1.8756) save_time: 8.8421 (14.7395) lr: 0.000015 max mem: 26307
2022-03-17 08:05:54,215.215 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6764705777168274
2022-03-17 08:05:54,215.215 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 140.3431396484375
2022-03-17 08:05:54,215.215 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.59326526242081
2022-03-17 08:06:22,283.283 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023203222081065178
2022-03-17 08:06:22,283.283 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 08:06:22,283.283 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'table', 'has', '[MASK]', '[MASK]', "'", 's', 'items', 'on', 'it', '.', '[SEP]', '[PAD]', '[PAD]', ...]
2022-03-17 08:06:22,299.299 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'logo', 'wall', 'toy', 'hat', 'picture', 'box', 'man', 'book', 'table', 'window', 'car', 'sign', 'shirt', 'doll', 'helmet', 'tire', 'bear', 'floor', 'leg', 'light', 'person', 'poster', 'hair', 'bag', 'shelf', 'head', 'truck', 'door', 'figure', 'clothes', 'jacket', 'woman', 'wheel', 'ground', 'chair', 'suit', 'tree', 'store', 'hand', 'display', 'boot', 'uniform', 'desk', 'coat', 'pole', 'shoe', 'building', 'board', 'arm']
2022-03-17 08:06:38,206.206 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'book', 'board', 'table', 'wall', 'magazine', 'figure', 'box', 'block', 'picture', 'boat', 'bottle', 'hat', 'logo', 'toy', 'candy', 'miscellaneous']
2022-03-17 08:09:02,346.346 2829:trainer.py:487 do_train_dict(): eta: 4:55:05 iter: 56300 speed: 271.6 images/sec total_norm: 149.8941 (152.4911) loss: 139.7734 (140.5896) masked_loss: 1.4047 (1.4255) tag_loss: 137.9306 (139.1640) time: 1.4341 (1.8849) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4290 (1.8798) save_time: 8.8421 (14.7395) lr: 0.000015 max mem: 26307
2022-03-17 08:09:02,707.707 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5
2022-03-17 08:09:02,707.707 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 151.23934936523438
2022-03-17 08:09:02,707.707 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.5877472559611
2022-03-17 08:09:31,205.205 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023203494027256966
2022-03-17 08:09:31,206.206 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 08:09:31,206.206 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'piece', 'of', 'bro', '##cco', '##li', 'on', '[MASK]', 'metal', 'fork', '.', '[MASK]', '[PAD]', '[PAD]', ...]
2022-03-17 08:09:31,222.222 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['fork', '[UNK]', 'stem', 'table', 'background', 'plate', 'wall', 'close', 'food', 'apple', 'bowl', 'piece', 'spot', 'head', 'green', 'leaf', 'skin', 'object', 'handle', 'ring', 'banana', 'onion', 'fruit', 'surface', 'small', 'hole', 'water', 'rim', 'orange', 'vegetable', 'flower', 'white', 'yellow', 'ground', 'shadow', 'end', 'peel', 'line', 'other', 'top', 'band', 'next', 'side', 'reflection', 'image', 'plant', 'light', 'picture', 'body', 'full']
2022-03-17 08:09:47,120.120 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'table', 'food', 'metal', 'piece', 'plate', 'flower', 'stem', 'fork', 'banana']
2022-03-17 08:12:10,910.910 2829:trainer.py:487 do_train_dict(): eta: 4:52:13 iter: 56400 speed: 271.5 images/sec total_norm: 149.1908 (151.0065) loss: 141.2649 (141.3576) masked_loss: 1.4030 (1.4457) tag_loss: 139.2193 (139.9120) time: 1.4326 (1.8857) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4275 (1.8806) save_time: 8.8421 (14.7395) lr: 0.000015 max mem: 26307
2022-03-17 08:12:11,271.271 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.47058823704719543
2022-03-17 08:12:11,271.271 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 152.55862426757812
2022-03-17 08:12:11,272.272 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.58925298505125
2022-03-17 08:12:39,672.672 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023209817707538605
2022-03-17 08:12:39,672.672 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 08:12:39,672.672 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'in', 'a', 'red', 'tie', 'and', '[MASK]', 'is', 'holding', 'a', 'large', 'fish', '[SEP]', '[PAD]', '[PAD]', ...]
2022-03-17 08:12:39,688.688 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['eye', 'tie', 'ear', 'nose', 'head', 'face', 'shirt', 'man', 'jacket', 'hair', 'mouth', 'neck', 'collar', 'tree', 'smile', 'chin', 'knot', 'hand', 'coat', 'teeth', 'arm', 'eyebrow', '[UNK]', 'background', 'finger', 'shoulder', 'sky', 'grass', 'suit', 'button', 'car', 'young', 'person', 'hat', 'boy', 'forehead', 'bush', 'hood', 'strap', 'lip', 'sleeve', 'field', 'camera', 'ground', 'ring', 'woman', 'wall', 'watch', 'fence', 'blue']
2022-03-17 08:12:55,702.702 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'hand', 'face', 'water', 'large', 'red', 'mouth', 'smile', 'eye', 'neck', 'tree', 'sky', 'shirt', 'fish', 'animal', 'background', 'finger', 'nose', 'ear', 'suit', 'chin', 'tie', 'hat', 'cap', 'jacket', 'collar']
03-17 08:13:23.725 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi
03-17 08:13:23.725 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi
03-17 08:13:24.987 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}]
2022-03-17 08:15:19,303.303 2829:trainer.py:487 do_train_dict(): eta: 4:49:20 iter: 56500 speed: 271.8 images/sec total_norm: 148.4305 (149.5362) loss: 138.8120 (138.3451) masked_loss: 1.4183 (1.4488) tag_loss: 137.2401 (136.8963) time: 1.4319 (1.8839) data: 0.0001 (0.0005) to_device: 0.0051 (0.0051) time_gpu: 1.4266 (1.8783) save_time: 8.8421 (14.7395) lr: 0.000015 max mem: 26307
2022-03-17 08:15:19,664.664 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6388888955116272
2022-03-17 08:15:19,664.664 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 150.31787109375
2022-03-17 08:15:19,664.664 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.59516465284798
2022-03-17 08:15:48,079.079 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023205023258924484
2022-03-17 08:15:48,079.079 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 08:15:48,079.079 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'are', 'many', 'pedestrians', 'and', 'cyclists', 'along', 'this', 'small', 'street', '.', '[MASK]', '[PAD]', '[PAD]', ...]
2022-03-17 08:15:48,095.095 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['bike', '[UNK]', 'bicycle', 'shoe', 'man', 'line', 'jacket', 'street', 'road', 'person', 'glove', 'head', 'building', 'sidewalk', 'light', 'woman', 'hat', 'wheel', 'tire', 'sign', 'helmet', 'bag', 'hand', 'coat', 'shirt', 'car', 'leg', 'face', 'motorcycle', 'window', 'curb', 'jean', 'hair', 'pole', 'background', 'license', 'backpack', 'tree', 'umbrella', 'traffic', 'sky', 'boot', 'bus', 'van', 'vehicle', 'arm', 'glasses', 'basket', 'city', 'foot']
2022-03-17 08:16:04,034.034 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'many', 'head', 'man', 'hand', 'small', 'line', 'building', 'road', 'street', 'woman', 'hair', 'person', 'phone', 'cell', 'sign', 'leg', 'bag', 'wheel', 'coat', 'jacket', 'bike', 'bicycle', 'shoe', 'sidewalk', 'tire', 'curb', 'glove']
2022-03-17 08:18:27,874.874 2829:trainer.py:487 do_train_dict(): eta: 4:46:27 iter: 56600 speed: 271.5 images/sec total_norm: 147.9287 (151.0743) loss: 137.4375 (137.9743) masked_loss: 1.3778 (1.3839) tag_loss: 135.7999 (136.5904) time: 1.4342 (1.8858) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4291 (1.8806) save_time: 8.8421 (14.7395) lr: 0.000015 max mem: 26307
2022-03-17 08:18:28,233.233 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6060606241226196
2022-03-17 08:18:28,233.233 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 133.1803436279297
2022-03-17 08:18:28,233.233 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.60285644934922
2022-03-17 08:18:56,441.441 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023262659087777138
2022-03-17 08:18:56,441.441 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 08:18:56,441.441 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'person', '[MASK]', 'a', 'skate', '##board', 'on', 'a', 'street', 'doing', 'a', 'trick', '.', '[MASK]', '[PAD]', '[PAD]', ...]
2022-03-17 08:18:56,457.457 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'helmet', '[UNK]', 'bush', 'arm', 'ground', 'head', 'hand', 'boy', 'leg', 'shoe', 'man', 'wheel', 'tree', 'face', 'pad', 'knee', 'grass', 'wall', 'glove', 'sock', 'background', 'pole', 'field', 'short', 'young', 'fence', 'park', 'foot', 'person', 'belt', 'jean', 'ear', 'watch', 'nose', 'building', 'ball', 'bracelet', 'dirt', 'line', 'hat', 'hair', 'door', 'wrist', 'stripe', 'sleeve', 'elbow', 'uniform', 'band', 'road']
2022-03-17 08:19:12,369.369 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'door', 'road', 'street', 'light', 'car', 'ground', 'hair', 'person', 'arm', 'boy', 'plant', 'tree', 'watch', 'sky', 'shirt', 'nose', 'wheel', 'grass', 'bush', 'pole', 'wrist', 'trick', 'fence', 'helmet', 'shoe', 'sidewalk', 'curb', 'sweater', 'glove']
2022-03-17 08:21:36,293.293 2829:trainer.py:487 do_train_dict(): eta: 4:43:35 iter: 56700 speed: 271.7 images/sec total_norm: 149.2894 (150.7991) loss: 140.6784 (141.9022) masked_loss: 1.3397 (1.3747) tag_loss: 139.3652 (140.5275) time: 1.4313 (1.8841) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4262 (1.8789) save_time: 8.8421 (14.7395) lr: 0.000015 max mem: 26307
2022-03-17 08:21:36,654.654 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6969696879386902
2022-03-17 08:21:36,654.654 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 112.25345611572266
2022-03-17 08:21:36,654.654 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.60558924204867
2022-03-17 08:22:05,047.047 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023266250267624855
2022-03-17 08:22:05,048.048 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 08:22:05,048.048 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'white', 'detailed', 'kitchen', '[MASK]', 'shown', 'with', 'wood', 'floor', '##ing', '[MASK]', '[SEP]', '[PAD]', '[PAD]', ...]
2022-03-17 08:22:05,063.063 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['cabinet', 'door', 'floor', 'stove', '[UNK]', 'wall', 'kitchen', 'oven', 'refrigerator', 'handle', 'top', 'drawer', 'outlet', 'knob', 'white', 'window', 'black', 'sink', 'ceiling', 'tile', 'light', 'empty', 'room', 'fan', 'large', 'shelf', 'switch', 'microwave', 'wood', 'rack', 'bag', 'clean', 'open', 'small', 'wooden', 'cord', 'logo', 'new', 'cupboard', 'paper', 'old', 'leg', 'brown', 'range', 'counter', 'next', 'area', 'silver', 'hood', 'modern']
2022-03-17 08:22:20,979.979 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'white', 'top', 'door', 'floor', 'wall', 'wood', 'kitchen', 'handle', 'cabinet', 'detailed', 'sink', 'drawer', 'outlet', 'stove', 'oven', 'refrigerator']
2022-03-17 08:24:44,865.865 2829:trainer.py:487 do_train_dict(): eta: 4:40:42 iter: 56800 speed: 271.5 images/sec total_norm: 148.0666 (149.1261) loss: 136.3703 (136.7830) masked_loss: 1.4991 (1.4966) tag_loss: 134.9193 (135.2864) time: 1.4320 (1.8858) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4269 (1.8806) save_time: 8.8421 (14.7395) lr: 0.000014 max mem: 26307
2022-03-17 08:24:45,225.225 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091
2022-03-17 08:24:45,225.225 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 107.49156188964844
2022-03-17 08:24:45,226.226 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.61769520335541
2022-03-17 08:25:13,688.688 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023308832198381424
2022-03-17 08:25:13,689.689 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 08:25:13,689.689 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'woman', '[MASK]', 'in', 'a', 'kitchen', 'holding', 'a', '[MASK]', '##brush', '.', '[SEP]', '[PAD]', '[PAD]', ...]
2022-03-17 08:25:13,705.705 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'cabinet', 'kitchen', 'bowl', 'woman', 'curtain', 'hand', 'hair', 'glass', 'apron', 'shirt', 'bottle', 'table', 'window', 'wall', 'dress', 'container', 'shelf', 'pot', 'head', 'drawer', 'towel', 'cup', 'food', 'ear', 'girl', 'door', 'spoon', 'eye', 'refrigerator', 'face', 'sink', 'man', 'paper', 'nose', 'knife', 'person', 'plate', 'outlet', 'napkin', 'stove', 'handle', 'basket', 'top', 'counter', 'box', 'knob', 'fish', 'vase', 'arm']
2022-03-17 08:25:29,578.578 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'hand', 'face', 'book', 'woman', 'hair', 'wall', 'glass', 'paper', 'window', 'shirt', 'kitchen', 'fish', 'dress', 'ear', 'bowl', 'cabinet', 'knife', 'bottle', 'sink', 'pot', 'towel', 'basket', 'curtain', 'stove', 'microwave', 'apron']
2022-03-17 08:27:53,510.510 2829:trainer.py:487 do_train_dict(): eta: 4:37:49 iter: 56900 speed: 271.4 images/sec total_norm: 149.3862 (152.0056) loss: 140.9772 (139.8971) masked_loss: 1.4529 (1.4219) tag_loss: 139.6408 (138.4752) time: 1.4308 (1.8865) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4257 (1.8813) save_time: 8.8421 (14.7395) lr: 0.000014 max mem: 26307
2022-03-17 08:27:53,871.871 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6000000238418579
2022-03-17 08:27:53,872.872 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 133.73890686035156
2022-03-17 08:27:53,872.872 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.62367358960603 2022-03-17 08:28:22,705.705 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023288562893867493 2022-03-17 08:28:22,705.705 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 08:28:22,706.706 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'beige', 'and', 'red', 'and', 'a', 'blue', 'and', '[MASK]', 'and', 'white', 'train', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 08:28:22,721.721 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['window', 'sky', 'train', 'tree', 'track', 'pole', 'car', 'building', 'roof', 'tower', 'station', 'platform', 'door', 'sidewalk', 'ground', 'line', 'stripe', 'wire', 'red', 'bridge', 'sign', 'water', 'wheel', 'passenger', 'flag', 'fence', 'light', '[UNK]', 'wall', 'long', 'street', 'grass', 'next', 'person', 'white', 'logo', 'top', 'front', 'power', 'bench', 'gravel', 'chimney', 'bush', 'road', 'pavement', 'commuter', 'pillar', 'stop', 'railroad', 'other'] 2022-03-17 08:28:38,739.739 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['station', 'building', 'white', 'door', 'red', 'light', 'car', 'blue', 'track', 'green', 'window', 'train', 'tree', 'tower', 'sky', 'platform', 'roof', 'pole', 'bike', 'fence', 'stripe', 'beige'] 2022-03-17 08:31:02,075.075 2829:trainer.py:487 do_train_dict(): eta: 4:34:56 iter: 57000 speed: 271.5 images/sec total_norm: 146.9477 (150.6722) loss: 133.0166 (135.5403) masked_loss: 1.3881 (1.4403) tag_loss: 131.9956 (134.1000) time: 1.4313 (1.8856) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4262 (1.8805) save_time: 8.8421 (14.7395) lr: 0.000014 max mem: 26307 2022-03-17 08:31:02,437.437 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6470588445663452 2022-03-17 08:31:02,437.437 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 182.251953125 2022-03-17 08:31:02,438.438 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
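The eta: field shrinks by roughly one averaged iteration time per iteration (4:40:42 at iter 56800 down to 4:34:56 at iter 57000), consistent with eta ≈ (max_iter − iter) × avg_iter_time. A minimal sketch; max_iter is an assumption, since the log never states the total iteration count:

import datetime

def estimate_eta(cur_iter, max_iter, avg_iter_time_s):
    # Remaining iterations times average seconds per iteration, as H:MM:SS.
    remaining_s = (max_iter - cur_iter) * avg_iter_time_s
    return str(datetime.timedelta(seconds=int(remaining_s)))

# With the averages printed at iter 57000 and an assumed max_iter of 65800,
# this yields '4:36:33', in the neighborhood of the logged 4:34:56.
print(estimate_eta(57000, 65800, 1.8856))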
= 71.62265684725733 2022-03-17 08:31:31,150.150 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023288192227482796 2022-03-17 08:31:31,150.150 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 08:31:31,151.151 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'standing', 'on', 'a', '[MASK]', '[MASK]', 'holding', 'a', 'tennis', 'ra', '##c', '##quet', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 08:31:31,166.166 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'line', 'hand', '[UNK]', 'wall', 'shoe', 'man', 'tennis', 'court', 'arm', 'hair', 'leg', 'fence', 'jacket', 'building', 'shadow', 'head', 'ball', 'mouth', 'tree', 'handle', 'ground', 'face', 'pole', 'zipper', 'string', 'palm', 'wire', 'person', 'air', 'light', 'jean', 'stripe', 'logo', 'background', 'clothes', 'window', 'sign', 'shirt', 'house', 'writing', 'roof', 'street', 'mountain', 'guy', 'player', 'hill', 'cloud', 'net', 'antenna'] 2022-03-17 08:31:47,133.133 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'line', 'building', 'court', 'ground', 'hair', 'mouth', 'wall', 'arm', 'ball', 'sky', 'leg', 'handle', 'tennis', 'string', 'shadow', 'jacket', 'wire', 'fence', 'shoe'] 2022-03-17 08:34:10,873.873 2829:trainer.py:487 do_train_dict(): eta: 4:32:03 iter: 57100 speed: 271.2 images/sec total_norm: 148.8043 (150.7363) loss: 136.7576 (137.6475) masked_loss: 1.4386 (1.4457) tag_loss: 135.4401 (136.2019) time: 1.4320 (1.8880) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4267 (1.8828) save_time: 8.8421 (14.7395) lr: 0.000014 max mem: 26307 2022-03-17 08:34:11,235.235 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6470588445663452 2022-03-17 08:34:11,235.235 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 141.95419311523438 2022-03-17 08:34:11,236.236 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.62324387210232 2022-03-17 08:34:40,029.029 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02328598126769066 2022-03-17 08:34:40,029.029 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 08:34:40,030.030 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'three', 'kids', 'sitting', 'on', 'a', 'couch', 'playing', 'a', '[MASK]', 'game', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 08:34:40,045.045 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hair', 'pillow', 'boy', 'head', 'shirt', 'hand', 'couch', 'eye', '[UNK]', 'face', 'remote', 'ear', 'cushion', 'wall', 'nose', 'jean', 'arm', 'table', 'smile', 'leg', 'controller', 'control', 'chair', 'mouth', 'sweater', 'book', 'young', 'blanket', 'window', 'woman', 'laptop', 'paper', 'room', 'phone', 'dog', 'red', 'game', 'foot', 'cord', 'floor', 'kid', 'video', 'man', 'sleeve', 'cat', 'sofa', 'curtain', 'logo', 'picture', 'sock'] 2022-03-17 08:34:55,914.914 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'hand', 'game', 'woman', 'hair', 'girl', 'video', 'person', 'wall', 'boy', 'phone', 'jean', 'shirt', 'ear', 'hole', 'couch', 'pole', 'remote', 'glasses', 'logo', 'blanket', 'pillow', 'rug'] 2022-03-17 08:37:19,555.555 2829:trainer.py:487 do_train_dict(): eta: 4:29:10 iter: 57200 speed: 271.4 images/sec total_norm: 148.8788 (151.3214) loss: 137.1562 (137.7977) masked_loss: 1.4398 (1.4618) tag_loss: 135.5013 (136.3359) time: 1.4313 (1.8868) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4262 (1.8817) save_time: 8.8421 (14.7395) lr: 0.000014 max mem: 26307 2022-03-17 08:37:19,915.915 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6060606241226196 2022-03-17 08:37:19,915.915 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 117.8505859375 2022-03-17 08:37:19,915.915 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.6289490105594 2022-03-17 08:37:48,415.415 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023275790736079216 2022-03-17 08:37:48,415.415 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 08:37:48,416.416 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'box', 'of', 'six', 'don', '[MASK]', 'with', '[MASK]', '##eb', '##ora', '##te', 'decorations', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 08:37:48,431.431 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'box', 'hole', 'line', 'chocolate', 'table', 'different', 'paper', 'dozen', 'lid', 'food', 'pastry', 'top', 'light', 'cheese', 'cream', 'ball', 'wall', 'cake', 'candy', 'bunch', 'container', 'tile', 'yellow', 'open', 'sugar', 'variety', 'potato', 'full', 'group', 'reflection', 'dessert', 'cardboard', 'design', 'piece', 'bowl', 'stripe', 'orange', 'white', 'other', 'various', 'tray', 'large', 'several', 'kind', 'half', 'twelve', 'butter', 'glazed', 'close'] 2022-03-17 08:38:04,348.348 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'line', 'paper', 'box', 'hole'] 2022-03-17 08:40:28,435.435 2829:trainer.py:487 do_train_dict(): eta: 4:26:17 iter: 57300 speed: 271.1 images/sec total_norm: 148.6674 (151.1573) loss: 140.6454 (140.7898) masked_loss: 1.4309 (1.4318) tag_loss: 139.1984 (139.3580) time: 1.4306 (1.8888) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4256 (1.8838) save_time: 8.8421 (14.7395) lr: 0.000014 max mem: 26307 2022-03-17 08:40:28,796.796 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6060606241226196 2022-03-17 08:40:28,797.797 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 141.8095703125 2022-03-17 08:40:28,797.797 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.6352293100922 2022-03-17 08:40:57,641.641 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023335572332143784 2022-03-17 08:40:57,641.641 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 08:40:57,641.641 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'polar', 'bear', 'under', 'water', '[MASK]', 'with', 'a', '[MASK]', 'ball', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 08:40:57,657.657 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['bear', 'polar', 'head', 'paw', 'nose', 'eye', 'water', 'ball', 'ear', 'leg', 'rock', 'claw', 'fur', 'egg', 'foot', 'mouth', 'white', 'face', 'ice', 'reflection', '[UNK]', 'snout', 'large', 'leaf', 'bubble', 'snow', 'ledge', 'animal', 'wall', 'ground', 'back', 'tail', 'fish', 'blue', 'food', 'small', 'light', 'bowl', 'underwater', 'object', 'hole', 'grass', 'branch', 'pool', 'container', 'plant', 'other', 'handle', 'close', 'brown'] 2022-03-17 08:41:13,570.570 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'water', 'white', 'playing', 'mouth', 'eye', 'ice', 'foot', 'ball', 'leg', 'nose', 'ear', 'bear', 'polar', 'paw'] 03-17 08:43:25.088 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-17 08:43:25.088 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-17 08:43:26.191 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-17 08:43:37,096.096 2829:trainer.py:487 do_train_dict(): eta: 4:23:23 iter: 57400 speed: 271.4 images/sec total_norm: 150.4902 (155.1706) loss: 139.3521 (139.5555) masked_loss: 1.3812 (1.4023) tag_loss: 138.2511 (138.1532) time: 1.4316 (1.8866) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4264 (1.8814) save_time: 8.8421 (14.7395) lr: 0.000014 max mem: 26307 2022-03-17 08:43:37,458.458 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6875 2022-03-17 08:43:37,459.459 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 122.96945190429688 2022-03-17 08:43:37,459.459 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
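The periodic monitor() lines report one dict per GPU with mem_used, mem_total and gpu_util, immediately after a "cmd_run(): nvidia-smi" entry. A plausible way to produce such a list is nvidia-smi's CSV query mode; this is a sketch of the idea, not the actual aml_server.py implementation:

import subprocess

def query_gpus():
    # One CSV line per GPU, e.g. "29000, 32510, 100".
    out = subprocess.check_output(
        [
            "nvidia-smi",
            "--query-gpu=memory.used,memory.total,utilization.gpu",
            "--format=csv,noheader,nounits",
        ],
        text=True,
    )
    gpus = []
    for line in out.strip().splitlines():
        used, total, util = (int(x) for x in line.split(", "))
        gpus.append({"mem_used": used, "mem_total": total, "gpu_util": util})
    return gpus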
= 71.64571930595066 2022-03-17 08:44:06,099.099 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023359859362244606 2022-03-17 08:44:06,099.099 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 08:44:06,100.100 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'living', 'room', 'with', 'a', '[MASK]', 'sectional', 'sofa', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 08:44:06,115.115 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['floor', 'wall', 'room', 'couch', 'stair', 'ceiling', 'railing', 'television', 'ottoman', 'staircase', 'door', 'pillow', 'bike', 'bicycle', 'blanket', 'living', 'light', 'table', 'book', 'sofa', 'flower', 'tire', 'wheel', 'stand', 'chair', 'basket', 'bag', '[UNK]', 'step', 'decoration', 'stairway', 'refrigerator', 'magazine', 'lamp', 'cabinet', 'arm', 'cushion', 'map', 'microwave', 'doorway', 'rail', 'fan', 'apartment', 'house', 'kitchen', 'box', 'top', 'vase', 'rug', 'white'] 2022-03-17 08:44:22,087.087 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['long', 'room', 'book', 'door', 'light', 'living', 'television', 'floor', 'table', 'wall', 'magazine', 'stand', 'window', 'step', 'box', 'coffee', 'ceiling', 'ottoman', 'couch', 'flower', 'bike', 'blanket', 'pillow', 'bicycle', 'lamp', 'sofa', 'staircase', 'curtain', 'railing', 'vase', 'stairway', 'stair', 'bouquet', 'sectional'] 2022-03-17 08:46:46,064.064 2829:trainer.py:487 do_train_dict(): eta: 4:20:30 iter: 57500 speed: 270.9 images/sec total_norm: 147.8803 (153.1434) loss: 137.8856 (139.5840) masked_loss: 1.4182 (1.4364) tag_loss: 136.5273 (138.1477) time: 1.4342 (1.8896) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4289 (1.8844) save_time: 8.8421 (14.7395) lr: 0.000013 max mem: 26307 2022-03-17 08:46:46,424.424 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6470588445663452 2022-03-17 08:46:46,425.425 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 166.89979553222656 2022-03-17 08:46:46,425.425 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.64459221230612 2022-03-17 08:47:14,986.986 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023396525532007217 2022-03-17 08:47:14,986.986 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 08:47:14,987.987 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'riding', 'a', '[MASK]', 'board', 'at', 'a', 'skate', 'park', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 08:47:15,002.002 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'wall', 'ramp', 'shirt', 'man', 'wheel', 'hand', 'head', 'arm', 'shoe', 'leg', 'shadow', 'floor', 'boy', 'short', 'light', 'skate', 'board', 'tile', 'hat', 'ceiling', 'hair', 'person', 'helmet', 'jean', 'building', 'graffiti', 'park', 'wire', 'pad', 'logo', 'ground', 'pole', 'air', 'sign', 'sky', 'foot', 'trick', 'knee', 'bowl', 'tree', 'door', 'picture', 'face', 'cap', 'fence', 'wood', 'line', 'box', 'sock'] 2022-03-17 08:47:30,945.945 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'hand', 'number', 'face', 'park', 'short', 'board', 'hair', 'person', 'floor', 'wall', 'arm', 'guitar', 'phone', 'shirt', 'leg', 'drawing', 'fence', 'shoe', 'ramp', 'tile', 'skate', 'graffiti'] 2022-03-17 08:49:54,751.751 2829:trainer.py:487 do_train_dict(): eta: 4:17:37 iter: 57600 speed: 271.4 images/sec total_norm: 147.6853 (149.3448) loss: 136.7640 (140.0522) masked_loss: 1.3705 (1.4063) tag_loss: 135.5707 (138.6459) time: 1.4321 (1.8869) data: 0.0001 (0.0005) to_device: 0.0051 (0.0050) time_gpu: 1.4270 (1.8814) save_time: 8.8421 (14.7395) lr: 0.000013 max mem: 26307 2022-03-17 08:49:55,111.111 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6571428775787354 2022-03-17 08:49:55,111.111 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 131.82594299316406 2022-03-17 08:49:55,112.112 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.65356919339872 2022-03-17 08:50:24,002.002 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023407846689224243 2022-03-17 08:50:24,002.002 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 08:50:24,002.002 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'bunch', 'of', '[MASK]', 'kind', 'of', 'surf', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 08:50:24,018.018 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sign', 'wall', 'ground', 'board', '[UNK]', 'pole', 'rope', 'post', 'base', 'stand', 'floor', 'shelf', 'rack', 'cord', 'grass', 'hook', 'display', 'box', 'chain', 'tag', 'door', 'leg', 'next', 'window', 'platform', 'wire', 'sky', 'tent', 'other', 'line', 'writing', 'ladder', 'orange', 'tree', 'banner', 'basket', 'art', 'table', 'sale', 'row', 'leaf', 'fence', 'object', 'group', 'shoe', 'white', 'net', 'beam', 'bunch', 'dirt'] 2022-03-17 08:50:39,919.919 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'different', 'ground', 'board', 'post', 'kind', 'wall', 'base', 'letter', 'sign', 'pole', 'beam', 'rope', 'bunch', 'banner', 'shelf', 'cord', 'bucket', 'surf', 'rack'] 2022-03-17 08:53:03,706.706 2829:trainer.py:487 do_train_dict(): eta: 4:14:44 iter: 57700 speed: 271.0 images/sec total_norm: 148.0219 (151.7331) loss: 138.9523 (140.3066) masked_loss: 1.4463 (1.4518) tag_loss: 137.7264 (138.8549) time: 1.4325 (1.8895) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4275 (1.8844) save_time: 8.8421 (14.7395) lr: 0.000013 max mem: 26307 2022-03-17 08:53:04,067.067 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6486486196517944 2022-03-17 08:53:04,067.067 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 150.50381469726562 2022-03-17 08:53:04,067.067 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.6492380254409 2022-03-17 08:53:32,915.915 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0234339889138937 2022-03-17 08:53:32,916.916 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 08:53:32,916.916 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'long', 'dining', 'room', 'table', 'filled', 'with', 'people', 'in', 'dress', 'clothing', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 08:53:32,933.933 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['glass', 'table', 'man', 'shirt', 'person', 'hair', 'paper', 'ceiling', 'glasses', 'wall', 'light', 'head', 'bar', 'wine', '[UNK]', 'menu', 'restaurant', 'plate', 'woman', 'room', 'bottle', 'picture', 'hand', 'chair', 'napkin', 'window', 'group', 'door', 'lamp', 'cup', 'hat', 'mirror', 'jacket', 'sign', 'face', 'camera', 'long', 'ear', 'large', 'column', 'doorway', 'speaker', 'book', 'frame', 'water', 'phone', 'pitcher', 'beam', 'vent', 'shelf'] 2022-03-17 08:53:48,781.781 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'room', 'door', 'light', 'woman', 'hair', 'person', 'table', 'wall', 'glass', 'paper', 'bar', 'sign', 'shirt', 'picture', 'dress', 'bowl', 'handle', 'plate', 'bottle', 'ceiling', 'clothing', 'glasses', 'pitcher', 'lamp', 'menu', 'candle', 'lemon', 'napkin', 'receipt'] 2022-03-17 08:56:12,573.573 2829:trainer.py:487 do_train_dict(): eta: 4:11:50 iter: 57800 speed: 271.1 images/sec total_norm: 148.7063 (150.6022) loss: 135.1004 (137.7678) masked_loss: 1.3843 (1.4452) tag_loss: 133.6597 (136.3226) time: 1.4319 (1.8887) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4270 (1.8835) save_time: 8.8421 (14.7395) lr: 0.000013 max mem: 26307 2022-03-17 08:56:12,934.934 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6969696879386902 2022-03-17 08:56:12,934.934 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 118.05091094970703 2022-03-17 08:56:12,934.934 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
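The caption acc values reduce to small integer fractions — 19/34 ≈ 0.5588 at iter 56800, 23/33 ≈ 0.6970 at iter 57800 — suggesting accuracy is computed only over the [MASK]ed positions visible in the "Input ids sample" lines, not over the whole padded sequence. A minimal sketch under that assumption (tensor names are illustrative):

import torch

def masked_token_accuracy(logits, target_ids, mask_positions):
    # logits: (B, L, V); target_ids: (B, L);
    # mask_positions: (B, L) bool, True where the input token was [MASK].
    pred = logits.argmax(dim=-1)
    correct = (pred == target_ids) & mask_positions
    return correct.sum().float() / mask_positions.sum().float()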
= 71.65965647195081 2022-03-17 08:56:41,607.607 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02344072423875332 2022-03-17 08:56:41,607.607 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 08:56:41,608.608 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'blue', '[MASK]', 'sitting', 'next', 'to', 'a', 'green', 'tree', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 08:56:41,623.623 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['vase', 'wall', 'floor', 'flower', 'plant', 'table', 'base', 'stand', 'leaf', 'shadow', '[UNK]', 'paper', 'book', 'blue', 'design', 'tree', 'pot', 'display', 'star', 'top', 'box', 'reflection', 'room', 'picture', 'next', 'shelf', 'frame', 'light', 'outlet', 'white', 'platform', 'large', 'window', 'leg', 'pedestal', 'front', 'wooden', 'tile', 'handle', 'small', 'colorful', 'painting', 'rug', 'man', 'head', 'corner', 'chair', 'green', 'hair', 'wood'] 2022-03-17 08:56:57,585.585 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['next', 'top', 'book', 'blue', 'green', 'floor', 'table', 'wall', 'stand', 'paper', 'plant', 'tree', 'box', 'wood', 'tall', 'flower', 'leaf', 'cloth', 'pot', 'vase'] 2022-03-17 08:59:21,647.647 2829:trainer.py:487 do_train_dict(): eta: 4:08:57 iter: 57900 speed: 270.8 images/sec total_norm: 149.0412 (150.6454) loss: 139.1870 (139.5264) masked_loss: 1.3890 (1.4332) tag_loss: 137.9378 (138.0931) time: 1.4333 (1.8907) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4281 (1.8855) save_time: 8.8421 (14.7395) lr: 0.000013 max mem: 26307 2022-03-17 08:59:22,008.008 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5428571701049805 2022-03-17 08:59:22,009.009 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 114.33843231201172 2022-03-17 08:59:22,009.009 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.67407786270668 2022-03-17 08:59:50,967.967 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023425210267305374 2022-03-17 08:59:50,968.968 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 08:59:50,968.968 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'swinging', 'a', 'baseball', '[MASK]', '[MASK]', 'standing', 'on', 'a', 'field', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 08:59:50,983.983 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'helmet', 'grass', 'glove', 'jersey', 'tree', 'field', 'dirt', 'man', 'arm', 'baseball', '[UNK]', 'bat', 'sky', 'name', 'building', 'player', 'number', 'ball', 'uniform', 'head', 'fence', 'background', 'belt', 'catcher', 'hand', 'cloud', 'shoe', 'back', 'pole', 'ground', 'leg', 'person', 'hat', 'cap', 'roof', 'batter', 'game', 'home', 'handle', 'umpire', 'boy', 'pitcher', 'logo', 'stripe', 'red', 'short', 'base', 'band', 'house'] 2022-03-17 09:00:06,984.984 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'name', 'number', 'building', 'player', 'field', 'ground', 'person', 'arm', 'tree', 'baseball', 'ball', 'sky', 'shirt', 'jersey', 'grass', 'hat', 'cap', 'uniform', 'pole', 'dirt', 'bat', 'fence', 'helmet', 'glove'] 2022-03-17 09:02:30,685.685 2829:trainer.py:487 do_train_dict(): eta: 4:06:03 iter: 58000 speed: 270.8 images/sec total_norm: 148.7249 (150.2698) loss: 140.1587 (140.1680) masked_loss: 1.4372 (1.4344) tag_loss: 138.7953 (138.7336) time: 1.4332 (1.8904) data: 0.0001 (0.0001) to_device: 0.0052 (0.0051) time_gpu: 1.4279 (1.8852) save_time: 8.8421 (14.7395) lr: 0.000013 max mem: 26307 2022-03-17 09:02:31,047.047 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7428571581840515 2022-03-17 09:02:31,047.047 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 118.852783203125 2022-03-17 09:02:31,047.047 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.68301808526307 2022-03-17 09:03:00,365.365 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023407941684126854 2022-03-17 09:03:00,365.365 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 09:03:00,366.366 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'gi', '##rra', '##ffe', 'standing', '[MASK]', 'to', 'some', 'rocks', 'and', 'trees', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 09:03:00,381.381 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['head', 'ear', '[UNK]', 'rock', 'horn', 'eye', 'neck', 'spot', 'mane', 'zoo', 'tree', 'mouth', 'bush', 'nose', 'shadow', 'wall', 'branch', 'face', 'ground', 'plant', 'boulder', 'next', 'leg', 'grass', 'hair', 'trunk', 'tongue', 'leaf', 'stone', 'pole', 'standing', 'chin', 'enclosure', 'arm', 'weed', 'young', 'body', 'fence', 'vine', 'dirt', 'hat', 'front', 'other', 'small', 'top', 'ivy', 'animal', 'post', 'large', 'tall'] 2022-03-17 09:03:16,253.253 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'next', 'rock', 'mouth', 'eye', 'plant', 'neck', 'tree', 'spot', 'nose', 'ear', 'shadow', 'bush', 'flower', 'leaf', 'horn', 'ivy', 'zoo', 'mane'] 2022-03-17 09:05:39,692.692 2829:trainer.py:487 do_train_dict(): eta: 4:03:10 iter: 58100 speed: 270.9 images/sec total_norm: 149.6495 (152.0553) loss: 140.6216 (139.8002) masked_loss: 1.3836 (1.4795) tag_loss: 138.8303 (138.3206) time: 1.4317 (1.8900) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4266 (1.8848) save_time: 8.8421 (14.7395) lr: 0.000013 max mem: 26307 2022-03-17 09:05:40,053.053 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6875 2022-03-17 09:05:40,053.053 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 123.70963287353516 2022-03-17 09:05:40,053.053 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
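Tag Precision (a running value creeping from ~71.60 to ~71.75 over this span) and Tag mAP (~0.0233) point at a multi-label tag evaluation: the "Sample Generation" list is the model's top-scoring tags, the "GT Tags" list is the reference set, and precision measures their overlap, while mAP averaged over a large tag vocabulary stays tiny because most classes never occur in the ground truth. A hedged sketch of the set-overlap precision; how many predictions are kept (top-k or a score threshold) is an assumption:

def tag_precision(predicted_tags, gt_tags):
    # Percentage of predicted tags that appear in the ground-truth set.
    predicted, gt = set(predicted_tags), set(gt_tags)
    if not predicted:
        return 0.0
    return 100.0 * len(predicted & gt) / len(predicted)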
= 71.68773946729313 2022-03-17 09:06:10,752.752 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023405494168400764 2022-03-17 09:06:10,753.753 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 09:06:10,753.753 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'at', 'an', '[MASK]', 'market', 'standing', 'behind', 'boxes', '[MASK]', 'bananas', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 09:06:10,769.769 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['person', 'box', 'umbrella', 'man', 'building', 'hat', 'banana', 'shirt', 'sky', 'woman', 'table', '[UNK]', 'city', 'pole', 'crate', 'head', 'boot', 'ground', 'shoe', 'bag', 'jacket', 'floor', 'cap', 'stand', 'hair', 'jean', 'cart', 'light', 'hand', 'group', 'market', 'backpack', 'tree', 'wheel', 'background', 'sign', 'leg', 'bunch', 'street', 'short', 'yellow', 'glasses', 'chair', 'large', 'bottle', 'boy', 'sweater', 'sunglasses', 'sidewalk', 'road'] 2022-03-17 09:06:26,657.657 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'city', 'head', 'man', 'building', 'woman', 'ground', 'hair', 'person', 'table', 'market', 'box', 'sky', 'jean', 'shirt', 'leg', 'bag', 'flag', 'hat', 'pole', 'outdoor', 'banner', 'boot', 'shoe', 'umbrella', 'backpack', 'banana', 'crate'] 2022-03-17 09:08:50,328.328 2829:trainer.py:487 do_train_dict(): eta: 4:00:17 iter: 58200 speed: 268.6 images/sec total_norm: 148.6850 (150.6591) loss: 136.6240 (137.9547) masked_loss: 1.4319 (1.4384) tag_loss: 135.3579 (136.5163) time: 1.4333 (1.9064) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4282 (1.9012) save_time: 8.8421 (14.7395) lr: 0.000012 max mem: 26307 2022-03-17 09:08:50,687.687 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.529411792755127 2022-03-17 09:08:50,687.687 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 143.49464416503906 2022-03-17 09:08:50,687.687 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.6921262463002 2022-03-17 09:09:20,177.177 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023391637951135635 2022-03-17 09:09:20,178.178 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 09:09:20,178.178 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'group', '[MASK]', 'goats', 'sitting', 'on', '[MASK]', 'green', 'grass', 'beside', 'a', 'body', 'of', 'water', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 09:09:20,193.193 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'grass', 'gravel', 'rock', 'plant', 'ground', 'trunk', 'wood', 'field', 'forest', 'hill', 'animal', 'sheep', '[UNK]', 'pond', 'branch', 'bush', 'head', 'shadow', 'dirt', 'goat', 'cow', 'fence', 'grassy', 'leg', 'flower', 'hole', 'horse', 'water', 'donkey', 'stick', 'area', 'post', 'green', 'lush', 'leaf', 'lamb', 'grazing', 'group', 'roof', 'wall', 'herd', 'open', 'ear', 'white', 'pine', 'large', 'log', 'elephant', 'road'] 2022-03-17 09:09:36,061.061 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'group', 'water', 'body', 'ground', 'rock', 'green', 'hill', 'forest', 'plant', 'tree', 'wood', 'branch', 'shadow', 'grass', 'bush', 'dirt', 'trunk', 'sheep', 'gravel', 'cow', 'goat', 'lush'] 2022-03-17 09:11:59,523.523 2829:trainer.py:487 do_train_dict(): eta: 3:57:23 iter: 58300 speed: 270.6 images/sec total_norm: 149.2654 (152.6587) loss: 137.8673 (138.9792) masked_loss: 1.4236 (1.3977) tag_loss: 136.4411 (137.5815) time: 1.4326 (1.8920) data: 0.0002 (0.0002) to_device: 0.0049 (0.0048) time_gpu: 1.4276 (1.8870) save_time: 8.8421 (14.7395) lr: 0.000012 max mem: 26307 2022-03-17 09:11:59,883.883 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663 2022-03-17 09:11:59,884.884 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 139.43861389160156 2022-03-17 09:11:59,884.884 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.69305378770176 2022-03-17 09:12:29,420.420 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023406347259879112 2022-03-17 09:12:29,420.420 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 09:12:29,421.421 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'very', 'small', 'and', '[MASK]', 'kitchen', 'sits', 'upwards', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 09:12:29,436.436 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'kitchen', 'cabinet', 'window', 'sink', 'handle', 'stove', 'towel', 'curtain', 'wall', 'microwave', 'door', 'refrigerator', 'ceiling', 'oven', 'bottle', 'light', 'drawer', 'floor', 'pot', 'container', 'cup', 'rack', 'bowl', 'paper', 'magnet', 'bag', 'basket', 'sponge', 'top', 'knob', 'knife', 'picture', 'shelf', 'clock', 'maker', 'plant', 'outlet', 'dish', 'vase', 'jar', 'cord', 'glove', 'mug', 'glass', 'kettle', 'plate', 'spoon', 'counter', 'green'] 2022-03-17 09:12:45,429.429 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'small', 'top', 'door', 'light', 'cup', 'wall', 'paper', 'window', 'kitchen', 'clean', 'handle', 'cabinet', 'bottle', 'ceiling', 'sink', 'pot', 'towel', 'curtain', 'shelf', 'container', 'drawer', 'mug', 'spoon', 'glove', 'stove', 'knob', 'oven', 'refrigerator', 'microwave'] 03-17 09:13:26.292 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-17 09:13:26.292 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-17 09:13:27.560 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-17 09:15:09,114.114 2829:trainer.py:487 do_train_dict(): eta: 3:54:29 iter: 58400 speed: 270.1 images/sec total_norm: 148.1431 (150.0670) loss: 137.8973 (138.9593) masked_loss: 1.3678 (1.4088) tag_loss: 136.2453 (137.5505) time: 1.4342 (1.8959) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4291 (1.8908) save_time: 8.8421 (14.7395) lr: 0.000012 max mem: 26307 2022-03-17 09:15:09,475.475 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6666666865348816 2022-03-17 09:15:09,475.475 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 107.7069091796875 2022-03-17 09:15:09,475.475 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.69959824390901 2022-03-17 09:15:38,681.681 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02340935356914997 2022-03-17 09:15:38,681.681 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 09:15:38,682.682 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'older', 'woman', 'walks', 'in', 'the', 'rain', 'with', '[MASK]', 'umbrella', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 09:15:38,697.697 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['umbrella', 'hand', 'sidewalk', 'jacket', 'woman', 'person', 'building', 'ground', 'rain', 'line', 'coat', 'shoe', 'bag', 'mouth', '[UNK]', 'purse', 'window', 'street', 'face', 'man', 'head', 'skirt', 'sign', 'leg', 'hair', 'dress', 'handle', 'pole', 'hat', 'reflection', 'foot', 'road', 'watch', 'light', 'jean', 'car', 'glasses', 'lady', 'curb', 'fence', 'tree', 'rainy', 'strap', 'door', 'tire', 'city', 'shirt', 'boot', 'wall', 'store'] 2022-03-17 09:15:54,589.589 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'hand', 'face', 'line', 'building', 'door', 'road', 'street', 'woman', 'car', 'ground', 'hair', 'mouth', 'person', 'foot', 'tree', 'watch', 'bus', 'leg', 'dress', 'bag', 'rain', 'truck', 'coat', 'jacket', 'fence', 'purse', 'skirt', 'shoe', 'sidewalk', 'umbrella'] 2022-03-17 09:18:18,547.547 2829:trainer.py:487 do_train_dict(): eta: 3:51:36 iter: 58500 speed: 270.3 images/sec total_norm: 149.4313 (151.9876) loss: 141.6673 (139.2598) masked_loss: 1.3650 (1.3889) tag_loss: 140.1983 (137.8709) time: 1.4327 (1.8943) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4275 (1.8891) save_time: 8.8421 (14.7395) lr: 0.000012 max mem: 26307 2022-03-17 09:18:18,910.910 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.8529411554336548 2022-03-17 09:18:18,910.910 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 125.83739471435547 2022-03-17 09:18:18,910.910 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.71227444723604 2022-03-17 09:18:48,264.264 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023428311571478844 2022-03-17 09:18:48,264.264 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 09:18:48,265.265 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'dog', 'that', 'is', 'laying', 'between', '[MASK]', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 09:18:48,280.280 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'jean', 'dog', 'hair', 'head', 'couch', 'boy', 'hand', 'sock', 'leg', 'remote', 'ear', 'person', 'floor', 'eye', 'blanket', 'pillow', 'wall', 'shoe', 'collar', '[UNK]', 'nose', 'face', 'carpet', 'mouth', 'child', 'paw', 'tail', 'control', 'bed', 'foot', 'woman', 'cushion', 'rug', 'chair', 'girl', 'glasses', 'arm', 'sweater', 'man', 'controller', 'sofa', 'kid', 'book', 'young', 'knee', 'door', 'table', 'baby', 'laptop'] 2022-03-17 09:19:04,246.246 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'hand', 'face', 'book', 'hair', 'girl', 'mouth', 'child', 'bed', 'table', 'wall', 'boy', 'magazine', 'eye', 'jean', 'shirt', 'dog', 'leg', 'ear', 'kid', 'tail', 'couch', 'mouse', 'keyboard', 'collar', 'pillow', 'shoe', 'cushion', 'sock'] 2022-03-17 09:21:27,904.904 2829:trainer.py:487 do_train_dict(): eta: 3:48:42 iter: 58600 speed: 270.4 images/sec total_norm: 150.2085 (153.8813) loss: 138.6582 (138.3724) masked_loss: 1.4537 (1.4222) tag_loss: 137.0674 (136.9502) time: 1.4316 (1.8936) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4264 (1.8883) save_time: 8.8421 (14.7395) lr: 0.000012 max mem: 26307 2022-03-17 09:21:28,265.265 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.625 2022-03-17 09:21:28,266.266 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 124.38931274414062 2022-03-17 09:21:28,266.266 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.72021060014299 2022-03-17 09:21:57,634.634 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02346744015812874 2022-03-17 09:21:57,634.634 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 09:21:57,634.634 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'men', 'in', 'a', 'batting', 'position', 'playing', 'baseball', 'in', '[MASK]', 'field', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 09:21:57,650.650 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['dirt', 'field', 'helmet', 'shoe', 'grass', '[UNK]', 'line', 'man', 'bat', 'catcher', 'uniform', 'shirt', 'glove', 'leg', 'plate', 'mask', 'batter', 'home', 'umpire', 'player', 'belt', 'baseball', 'jersey', 'head', 'ground', 'hand', 'number', 'fence', 'game', 'base', 'guard', 'wall', 'box', 'shin', 'arm', 'person', 'stand', 'hat', 'ready', 'camera', 'ball', 'cooler', 'pitch', 'face', 'name', 'swing', 'sock', 'stripe', 'chair', 'spectator'] 2022-03-17 09:22:13,615.615 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'name', 'home', 'line', 'player', 'field', 'position', 'ground', 'baseball', 'shirt', 'jersey', 'leg', 'plate', 'grass', 'belt', 'uniform', 'dirt', 'bat', 'mask', 'batting', 'helmet', 'shoe', 'catcher', 'glove', 'batter'] 2022-03-17 09:24:37,370.370 2829:trainer.py:487 do_train_dict(): eta: 3:45:48 iter: 58700 speed: 270.2 images/sec total_norm: 148.3143 (152.4183) loss: 136.0296 (137.7186) masked_loss: 1.4282 (1.4334) tag_loss: 134.8833 (136.2852) time: 1.4313 (1.8946) data: 0.0001 (0.0005) to_device: 0.0051 (0.0050) time_gpu: 1.4260 (1.8890) save_time: 8.8421 (14.7395) lr: 0.000012 max mem: 26307 2022-03-17 09:24:37,730.730 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.44117647409439087 2022-03-17 09:24:37,730.730 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 118.98245239257812 2022-03-17 09:24:37,730.730 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
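The speed field agrees with a fixed global batch divided by the averaged iteration time: 512 / 1.8946 s ≈ 270.2 images/sec for the iter-58700 record above, and the same 512 reproduces the other speed values in this span (e.g. 512 / 1.8858 ≈ 271.5). A global batch of 512 (64 per GPU across the 8 V100s being monitored) is inferred from the numbers, not stated in the log:

def images_per_sec(global_batch_size, avg_iter_time_s):
    # Throughput as logged by the trainer, under the assumed batch size.
    return global_batch_size / avg_iter_time_s

print(round(images_per_sec(512, 1.8946), 1))  # 270.2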
= 71.72977999602857 2022-03-17 09:25:06,934.934 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02347312681376934 2022-03-17 09:25:06,934.934 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 09:25:06,935.935 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'zebra', '##s', 'in', 'their', 'pen', 'some', '[MASK]', 'a', 'fence', 'and', 'tree', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 09:25:06,950.950 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'trunk', 'zebra', 'ground', 'shadow', 'pole', 'post', 'leaf', 'leg', 'fence', '[UNK]', 'grass', 'trough', 'dirt', 'head', 'rock', 'box', 'wood', 'enclosure', 'board', 'tail', 'branch', 'stripe', 'zoo', 'ear', 'building', 'mane', 'food', 'log', 'shade', 'wooden', 'bench', 'hay', 'roof', 'wall', 'cart', 'structure', 'next', 'other', 'bush', 'area', 'sign', 'feeder', 'door', 'basket', 'bin', 'crate', 'group', 'couple', 'horn'] 2022-03-17 09:25:22,941.941 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'building', 'ground', 'post', 'food', 'tree', 'leg', 'gate', 'shadow', 'grass', 'pole', 'dirt', 'leaf', 'pen', 'trunk', 'fence', 'shade', 'zoo', 'trough', 'zebra'] 2022-03-17 09:27:47,126.126 2829:trainer.py:487 do_train_dict(): eta: 3:42:54 iter: 58800 speed: 269.8 images/sec total_norm: 148.7619 (152.6464) loss: 137.1708 (137.0621) masked_loss: 1.4380 (1.4223) tag_loss: 135.5548 (135.6398) time: 1.4323 (1.8976) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4271 (1.8925) save_time: 8.8421 (14.7395) lr: 0.000011 max mem: 26307 2022-03-17 09:27:47,488.488 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7142857313156128 2022-03-17 09:27:47,488.488 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 126.76809692382812 2022-03-17 09:27:47,488.488 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.73064879205312 2022-03-17 09:28:16,718.718 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023475831374526024 2022-03-17 09:28:16,718.718 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 09:28:16,719.719 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'bethany', 'clock', 'on', 'the', 'outside', 'of', 'a', '[MASK]', 'building', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 09:28:16,734.734 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['building', 'clock', 'wall', '[UNK]', 'sun', 'star', 'window', 'number', 'painting', 'statue', 'roof', 'wire', 'arch', 'archway', 'sky', 'circle', 'balcony', 'tower', 'hand', 'roman', 'pipe', 'large', 'sculpture', 'column', 'spire', 'lion', 'decoration', 'side', 'sign', 'light', 'design', 'pillar', 'art', 'wing', 'pole', 'cross', 'bird', 'brick', 'big', 'ceiling', 'city', 'railing', 'street', 'door', 'picture', 'reflection', 'tree', 'bridge', 'ornate', 'sword'] 2022-03-17 09:28:32,685.685 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['number', 'building', 'large', 'light', 'outside', 'star', 'wall', 'sun', 'window', 'sky', 'roof', 'circle', 'clock', 'concrete', 'statue', 'arch', 'balcony', 'archway'] 2022-03-17 09:30:56,671.671 2829:trainer.py:487 do_train_dict(): eta: 3:40:01 iter: 58900 speed: 270.1 images/sec total_norm: 149.1548 (153.4005) loss: 138.1637 (140.4498) masked_loss: 1.3448 (1.4004) tag_loss: 136.5350 (139.0494) time: 1.4325 (1.8954) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4271 (1.8902) save_time: 8.8421 (14.7395) lr: 0.000011 max mem: 26307 2022-03-17 09:30:57,032.032 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.44736841320991516 2022-03-17 09:30:57,032.032 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 148.7872314453125 2022-03-17 09:30:57,032.032 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.73392271268166 2022-03-17 09:31:26,801.801 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02353179268538952 2022-03-17 09:31:26,801.801 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 09:31:26,802.802 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'tile', 'floor', 'in', 'an', '[MASK]', 'kitchen', 'with', '[MASK]', 'doors', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 09:31:26,817.817 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['floor', 'door', 'plant', 'wall', 'window', '[UNK]', 'kitchen', 'table', 'tile', 'fence', 'cabinet', 'curtain', 'towel', 'tree', 'vase', 'leg', 'railing', 'ceiling', 'rack', 'chair', 'handle', 'bowl', 'switch', 'outlet', 'sink', 'rug', 'patio', 'dish', 'shelf', 'balcony', 'house', 'mat', 'pot', 'cloth', 'light', 'plate', 'room', 'refrigerator', 'board', 'oven', 'drawer', 'basket', 'top', 'shoe', 'counter', 'stove', 'cutting', 'picture', 'blind', 'flower'] 2022-03-17 09:31:42,835.835 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'house', 'open', 'door', 'floor', 'table', 'wall', 'plant', 'window', 'tree', 'kitchen', 'picture', 'leg', 'bowl', 'handle', 'cabinet', 'ceiling', 'flower', 'switch', 'sink', 'cloth', 'fence', 'towel', 'curtain', 'balcony', 'mat', 'tile', 'rack', 'railing', 'microwave', 'vase', 'patio', 'rug'] 2022-03-17 09:34:06,073.073 2829:trainer.py:487 do_train_dict(): eta: 3:37:07 iter: 59000 speed: 270.3 images/sec total_norm: 149.2645 (151.1273) loss: 137.3948 (139.3826) masked_loss: 1.4110 (1.4411) tag_loss: 135.7160 (137.9415) time: 1.4325 (1.8941) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4271 (1.8888) save_time: 8.8421 (14.7395) lr: 0.000011 max mem: 26307 2022-03-17 09:34:06,435.435 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7428571581840515 2022-03-17 09:34:06,435.435 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 136.85707092285156 2022-03-17 09:34:06,435.435 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
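The lr column steps down from 0.000014 to 0.000011 across these ~2,500 iterations. Since it is printed to only six decimal places, the underlying schedule is presumably smooth and the steps are rounding artifacts, consistent with a decay toward zero near the end of training. A sketch under a linear-decay assumption; base_lr and max_iter are assumptions:

def linear_decay_lr(base_lr, cur_iter, max_iter):
    # Linearly anneal the learning rate to zero at max_iter.
    return base_lr * (1.0 - cur_iter / max_iter)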
= 71.73886231682224 2022-03-17 09:34:36,152.152 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02352909743785858 2022-03-17 09:34:36,152.152 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 09:34:36,152.152 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'four', 'tennis', 'players', 'are', 'competing', 'on', 'a', '[MASK]', 'court', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 09:34:36,168.168 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'court', 'net', 'tennis', '[UNK]', 'short', 'man', 'line', 'hair', 'shoe', 'fence', 'person', 'leg', 'shadow', 'hand', 'player', 'head', 'pole', 'woman', 'arm', 'hat', 'sign', 'outfit', 'tree', 'wall', 'chair', 'girl', 'ball', 'boy', 'sky', 'sock', 'match', 'uniform', 'skirt', 'dress', 'couple', 'top', 'stand', 'cloud', 'light', 'bag', 'game', 'car', 'playing', 'ground', 'building', 'cap', 'background', 'group', 'house'] 2022-03-17 09:34:52,130.130 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'hand', 'line', 'woman', 'court', 'short', 'hair', 'girl', 'arm', 'shirt', 'leg', 'dress', 'tennis', 'shadow', 'net', 'fence', 'shoe', 'outfit'] 2022-03-17 09:37:15,684.684 2829:trainer.py:487 do_train_dict(): eta: 3:34:13 iter: 59100 speed: 270.0 images/sec total_norm: 149.6712 (150.8104) loss: 141.0224 (139.9859) masked_loss: 1.3148 (1.4002) tag_loss: 139.3551 (138.5858) time: 1.4316 (1.8960) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4264 (1.8909) save_time: 8.8421 (14.7395) lr: 0.000011 max mem: 26307 2022-03-17 09:37:16,049.049 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7352941036224365 2022-03-17 09:37:16,050.050 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 140.25619506835938 2022-03-17 09:37:16,050.050 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.74285661207664 2022-03-17 09:37:45,809.809 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023554671555757523 2022-03-17 09:37:45,810.810 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 09:37:45,810.810 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'laptop', 'sits', 'on', 'the', 'edge', 'of', '##lio', 'counter', 'with', 'chairs', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 09:37:45,825.825 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['floor', 'laptop', 'table', 'couch', 'door', 'keyboard', 'room', 'stool', 'rug', 'screen', 'pillow', 'bag', 'tile', 'wall', 'computer', 'leg', 'book', 'mouse', 'cup', 'coffee', 'shelf', '[UNK]', 'magazine', 'sofa', 'cord', 'top', 'living', 'chair', 'stand', 'remote', 'can', 'mug', 'cushion', 'antenna', 'seat', 'bowl', 'blanket', 'handle', 'tray', 'monitor', 'television', 'plate', 'box', 'small', 'purse', 'ottoman', 'bottle', 'paper', 'picture', 'backpack'] 2022-03-17 09:38:01,735.735 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'room', 'top', 'book', 'door', 'cup', 'living', 'floor', 'table', 'wall', 'magazine', 'computer', 'edge', 'screen', 'coffee', 'leg', 'bag', 'counter', 'plate', 'bottle', 'couch', 'mouse', 'purse', 'keyboard', 'pillow', 'sofa', 'shelf', 'laptop', 'tile', 'stool', 'rug'] 2022-03-17 09:40:25,225.225 2829:trainer.py:487 do_train_dict(): eta: 3:31:19 iter: 59200 speed: 270.1 images/sec total_norm: 147.8494 (150.2399) loss: 139.1447 (140.8926) masked_loss: 1.4374 (1.4575) tag_loss: 137.5555 (139.4351) time: 1.4316 (1.8954) data: 0.0001 (0.0002) to_device: 0.0050 (0.0050) time_gpu: 1.4262 (1.8903) save_time: 8.8421 (14.7395) lr: 0.000011 max mem: 26307 2022-03-17 09:40:25,587.587 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.59375 2022-03-17 09:40:25,587.587 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 141.48828125 2022-03-17 09:40:25,587.587 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.75340359126537
2022-03-17 09:40:55,144.144 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02355773001909256
2022-03-17 09:40:55,146.146 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 09:40:55,146.146 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'hot', '##dog', 'is', 'being', 'eaten', 'by', 'a', 'man', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 09:40:55,161.161 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hat', 'cap', 'man', 'hair', 'nose', 'head', 'hand', 'dog', 'ear', 'hot', 'jacket', 'woman', 'eye', 'window', 'sign', 'person', 'building', '[UNK]', 'bun', 'foil', 'sweater', 'face', 'door', 'shirt', 'scarf', 'mouth', 'thumb', 'coat', 'light', 'logo', 'letter', 'reflection', 'wall', 'finger', 'sunglasses', 'paper', 'vest', 'backpack', 'girl', 'bag', 'jean', 'food', 'pole', 'glasses', 'strap', 'zipper', 'store', 'sleeve', 'number', 'collar']
2022-03-17 09:41:11,178.178 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'building', 'door', 'woman', 'hair', 'person', 'hot', 'eye', 'window', 'sign', 'shirt', 'dog', 'nose', 'ear', 'hat', 'cap', 'jacket', 'bow', 'sweater', 'foil', 'scarf', 'bun']
03-17 09:43:27.657 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi
03-17 09:43:27.657 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi
03-17 09:43:28.924 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}]
2022-03-17 09:43:34,721.721 2829:trainer.py:487 do_train_dict(): eta: 3:28:25 iter: 59300 speed: 270.2 images/sec total_norm: 148.4815 (152.0117) loss: 141.4568 (140.8736) masked_loss: 1.4203 (1.4649) tag_loss: 139.8029 (139.4087) time: 1.4312 (1.8950) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4260 (1.8898) save_time: 8.8421 (14.7395) lr: 0.000011 max mem: 26307
2022-03-17 09:43:35,079.079 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.529411792755127
2022-03-17 09:43:35,080.080 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 146.22079467773438
2022-03-17 09:43:35,080.080 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 71.7579834099972
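The monitor() record above reports one dict per GPU with mem_used, mem_total (MiB) and gpu_util (%). A sketch of producing the same shape via nvidia-smi's CSV query mode; the query flags are standard nvidia-smi options, but aml_server.py itself (which the log shows invoking plain `nvidia-smi`) may well parse the human-readable table instead:

```python
import subprocess

def gpu_stats():
    """Return one dict per GPU, shaped like the monitor() records above."""
    out = subprocess.check_output(
        ["nvidia-smi",
         "--query-gpu=memory.used,memory.total,utilization.gpu",
         "--format=csv,noheader,nounits"],
        text=True)
    stats = []
    for line in out.strip().splitlines():
        # Each CSV line looks like "29000, 32510, 100" with nounits set.
        mem_used, mem_total, gpu_util = (int(v) for v in line.split(", "))
        stats.append({"mem_used": mem_used,
                      "mem_total": mem_total,
                      "gpu_util": gpu_util})
    return stats
```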
2022-03-17 09:44:05,006.006 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023516250774264336
2022-03-17 09:44:05,006.006 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 09:44:05,007.007 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'row', 'of', 'red', 'valves', '[MASK]', 'dotted', 'among', 'some', 'shrubs', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 09:44:05,022.022 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['fire', '[UNK]', 'cap', 'bolt', 'wall', 'bush', 'plant', 'top', 'leaf', 'red', 'base', 'knob', 'flower', 'bottom', 'background', 'green', 'chain', 'line', 'foliage', 'sign', 'stem', 'window', 'wheel', 'next', 'ground', 'blue', 'tree', 'reflection', 'rock', 'plug', 'group', 'writing', 'block', 'building', 'dirt', 'toy', 'water', 'nut', 'pipe', 'front', 'side', 'garden', 'silver', 'tile', 'lid', 'picture', 'letter', 'handle', 'area', 'fence']
2022-03-17 09:44:20,969.969 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'line', 'top', 'red', 'fire', 'wall', 'base', 'plant', 'row', 'bush', 'cap', 'flower', 'bolt']
2022-03-17 09:46:44,329.329 2829:trainer.py:487 do_train_dict(): eta: 3:25:30 iter: 59400 speed: 270.0 images/sec total_norm: 148.0462 (153.2958) loss: 137.4192 (139.5342) masked_loss: 1.4298 (1.4557) tag_loss: 135.9200 (138.0785) time: 1.4325 (1.8961) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4272 (1.8909) save_time: 8.8421 (14.7395) lr: 0.000011 max mem: 26307
2022-03-17 09:46:44,690.690 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.529411792755127
2022-03-17 09:46:44,690.690 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 126.42683410644531
2022-03-17 09:46:44,691.691 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.75964920300396 2022-03-17 09:47:14,616.616 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023525401949882507 2022-03-17 09:47:14,616.616 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 09:47:14,617.617 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'leaning', 'up', 'against', 'a', 'wall', 'with', 'graffiti', 'and', 'text', '##ing', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 09:47:14,632.632 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'wall', 'short', 'hair', 'boy', 'girl', 'graffiti', 'head', 'hand', '[UNK]', 'person', 'sidewalk', 'man', 'shoe', 'ground', 'leg', 'woman', 'phone', 'sock', 'young', 'building', 'arm', 'group', 'flop', 'foot', 'design', 'window', 'face', 'flip', 'ear', 'shadow', 'floor', 'cell', 'painting', 'hat', 'sky', 'eye', 'sleeve', 'glasses', 'kid', 'drawing', 'tree', 'child', 'picture', 'sweater', 'leaf', 'bat', 'sign', 'curb', 'bench'] 2022-03-17 09:47:30,605.605 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'group', 'door', 'woman', 'short', 'ground', 'hair', 'girl', 'person', 'wall', 'arm', 'boy', 'phone', 'eye', 'shirt', 'leg', 'boot', 'shoe', 'dot', 'sidewalk', 'graffiti'] 2022-03-17 09:49:54,150.150 2829:trainer.py:487 do_train_dict(): eta: 3:22:36 iter: 59500 speed: 269.7 images/sec total_norm: 150.6443 (153.0435) loss: 136.9487 (137.0340) masked_loss: 1.4257 (1.4436) tag_loss: 135.8034 (135.5904) time: 1.4324 (1.8982) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4271 (1.8930) save_time: 8.8421 (14.7395) lr: 0.000010 max mem: 26307 2022-03-17 09:49:54,511.511 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361 2022-03-17 09:49:54,512.512 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 145.71640014648438 2022-03-17 09:49:54,512.512 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.75877954495833
2022-03-17 09:50:24,304.304 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023510891944169998
2022-03-17 09:50:24,304.304 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 09:50:24,305.305 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'young', 'premiered', 'holding', 'a', 'tennis', 'ball', 'and', 'swinging', 'a', '[MASK]', 'ra', '##c', '##quet', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 09:50:24,320.320 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shoe', 'tennis', '[UNK]', 'short', 'court', 'leg', 'sock', 'hand', 'line', 'shirt', 'head', 'shadow', 'hair', 'woman', 'ground', 'arm', 'handle', 'player', 'face', 'logo', 'ball', 'hat', 'ponytail', 'stripe', 'ear', 'cap', 'man', 'string', 'mouth', 'nose', 'band', 'blue', 'girl', 'female', 'skirt', 'knee', 'letter', 'sleeve', 'game', 'young', 'wall', 'top', 'glasses', 'eye', 'wrist', 'male', 'ready', 'outfit', 'necklace', 'white']
2022-03-17 09:50:40,283.283 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'hand', 'line', 'young', 'player', 'woman', 'court', 'short', 'ground', 'hair', 'arm', 'ball', 'shirt', 'leg', 'handle', 'tennis', 'string', 'shadow', 'shoe', 'ponytail', 'sock']
2022-03-17 09:53:03,904.904 2829:trainer.py:487 do_train_dict(): eta: 3:19:42 iter: 59600 speed: 269.8 images/sec total_norm: 149.9381 (151.4218) loss: 136.2480 (138.5462) masked_loss: 1.4119 (1.4164) tag_loss: 135.0492 (137.1298) time: 1.4318 (1.8975) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4268 (1.8925) save_time: 8.8421 (14.7395) lr: 0.000010 max mem: 26307
2022-03-17 09:53:04,264.264 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5833333134651184
2022-03-17 09:53:04,264.264 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 158.49700927734375
2022-03-17 09:53:04,264.264 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 71.75044187229483
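Each evaluation record pairs the model's ranked "Sample Generation" tags with the annotated "GT Tags". A set-overlap precision over the top-k generated tags, illustrated on the 09:50:24 record above, gives a feel for the numbers; this is an assumption about the metric, since the exact definition behind the running "Tag Precision" figure is not visible in the log:

```python
def tag_precision(generated, gt_tags, k=20):
    """Percentage of the top-k generated tags that appear in the GT set."""
    top_k = generated[:k]
    gt = set(gt_tags)
    return 100.0 * sum(tag in gt for tag in top_k) / len(top_k)

# Top-20 generated tags and the GT tags from the 09:50:24 record above:
generated = ['shoe', 'tennis', '[UNK]', 'short', 'court', 'leg', 'sock',
             'hand', 'line', 'shirt', 'head', 'shadow', 'hair', 'woman',
             'ground', 'arm', 'handle', 'player', 'face', 'logo']
gt_tags = ['[UNK]', 'head', 'hand', 'line', 'young', 'player', 'woman',
           'court', 'short', 'ground', 'hair', 'arm', 'ball', 'shirt', 'leg',
           'handle', 'tennis', 'string', 'shadow', 'shoe', 'ponytail', 'sock']
print(tag_precision(generated, gt_tags))  # 90.0: 18 of 20 are in the GT set
```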
2022-03-17 09:53:34,275.275 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02350086346268654
2022-03-17 09:53:34,276.276 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 09:53:34,276.276 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'boat', 'sits', 'on', 'a', 'river', '[MASK]', 'green', 'trees', 'and', '[MASK]', 'around', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 09:53:34,291.291 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'reflection', 'water', 'boat', 'grass', 'window', 'canal', 'light', 'person', 'dock', 'river', 'path', 'bank', 'bottom', 'bridge', 'building', 'roof', 'bush', '[UNK]', 'shore', 'flower', 'stripe', 'body', 'house', 'car', 'trunk', 'door', 'wall', 'top', 'plant', 'flag', 'small', 'sign', 'pole', 'lamp', 'chair', 'tire', 'next', 'post', 'sidewalk', 'man', 'blue', 'sky', 'front', 'shirt', 'umbrella', 'duck', 'bumper', 'shadow', 'can']
2022-03-17 09:53:50,229.229 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['water', 'river', 'top', 'green', 'bank', 'window', 'tree', 'boat', 'canal', 'bush', 'reflection', 'foliage', 'bumper']
2022-03-17 09:56:13,820.820 2829:trainer.py:487 do_train_dict(): eta: 3:16:48 iter: 59700 speed: 269.6 images/sec total_norm: 151.9569 (154.5600) loss: 138.4757 (140.1799) masked_loss: 1.3459 (1.3609) tag_loss: 137.1900 (138.8190) time: 1.4321 (1.8992) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4271 (1.8940) save_time: 8.8421 (14.7395) lr: 0.000010 max mem: 26307
2022-03-17 09:56:14,181.181 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6000000238418579
2022-03-17 09:56:14,181.181 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 133.41464233398438
2022-03-17 09:56:14,181.181 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.7534015681034 2022-03-17 09:56:44,281.281 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023498503491282463 2022-03-17 09:56:44,281.281 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 09:56:44,282.282 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'street', 'vendor', 'is', 'sitting', 'under', 'an', 'umbrella', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 09:56:44,297.297 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['road', 'street', 'tire', 'ground', '[UNK]', 'car', 'truck', 'sidewalk', 'window', 'box', 'building', 'tree', 'light', 'sky', 'shirt', 'sign', 'pole', 'man', 'wheel', 'van', 'person', 'cart', 'fence', 'head', 'shoe', 'hair', 'door', 'trash', 'house', 'roof', 'food', 'line', 'hand', 'grass', 'can', 'bag', 'plate', 'woman', 'handle', 'wall', 'basket', 'lot', 'jean', 'curb', 'container', 'mirror', 'parking', 'bike', 'bicycle', 'windshield'] 2022-03-17 09:57:00,380.380 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'line', 'building', 'road', 'street', 'cup', 'car', 'seat', 'cover', 'plant', 'tree', 'shirt', 'bowl', 'truck', 'mirror', 'pole', 'bike', 'pot', 'motorcycle', 'banner', 'basket', 'cart', 'sidewalk', 'tire', 'umbrella', 'bucket', 'rack', 'vendor'] 2022-03-17 09:59:23,929.929 2829:trainer.py:487 do_train_dict(): eta: 3:13:54 iter: 59800 speed: 269.3 images/sec total_norm: 148.0454 (149.6901) loss: 134.8782 (135.0346) masked_loss: 1.3867 (1.3873) tag_loss: 133.4146 (133.6473) time: 1.4325 (1.9011) data: 0.0001 (0.0005) to_device: 0.0051 (0.0051) time_gpu: 1.4272 (1.8955) save_time: 8.8421 (14.7395) lr: 0.000010 max mem: 26307 2022-03-17 09:59:24,292.292 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5555555820465088 2022-03-17 09:59:24,292.292 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 165.06137084960938 2022-03-17 09:59:24,292.292 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.75623725929324 2022-03-17 09:59:54,405.405 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02354694902896881 2022-03-17 09:59:54,405.405 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 09:59:54,405.405 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'and', 'son', 'joking', 'around', 'while', '[MASK]', 'the', 'park', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 09:59:54,420.420 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'woman', 'hair', 'hand', '[UNK]', 'face', 'head', 'nose', 'ear', 'arm', 'eye', 'man', 'camera', 'mouth', 'top', 'glasses', 'person', 'tank', 'grass', 'girl', 'sunglasses', 'tree', 'boy', 'leaf', 'hat', 'sleeve', 'couple', 'jean', 'flower', 'plate', 'dress', 'young', 'button', 'cap', 'pocket', 'short', 'vest', 'tattoo', 'design', 'wall', 'watch', 'white', 'black', 'next', 'pole', 'plant', 'microphone', 'lid', 'shoe', 'drum'] 2022-03-17 10:00:10,409.409 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'top', 'son', 'park', 'woman', 'hair', 'girl', 'arm', 'boy', 'eye', 'tree', 'shirt', 'nose', 'ear', 'tank', 'bottle', 'disc', 'glasses', 'sunglasses'] 2022-03-17 10:02:33,801.801 2829:trainer.py:487 do_train_dict(): eta: 3:10:59 iter: 59900 speed: 269.7 images/sec total_norm: 149.5469 (152.0728) loss: 139.1802 (139.9196) masked_loss: 1.4383 (1.4619) tag_loss: 137.6618 (138.4577) time: 1.4319 (1.8987) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4270 (1.8937) save_time: 8.8421 (14.7395) lr: 0.000010 max mem: 26307 2022-03-17 10:02:34,164.164 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6666666865348816 2022-03-17 10:02:34,164.164 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 123.70503234863281 2022-03-17 10:02:34,164.164 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.76532523473104
2022-03-17 10:03:04,303.303 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02358384057879448
2022-03-17 10:03:04,303.303 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 10:03:04,303.303 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'shelf', 'with', 'bowls', 'lined', '[MASK]', '[MASK]', 'it', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 10:03:04,319.319 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['bowl', 'wall', 'shelf', 'vase', 'pot', '[UNK]', 'handle', 'bird', 'plant', 'pitcher', 'table', 'rim', 'design', 'ledge', 'bathroom', 'bowls', 'line', 'cup', 'base', 'frame', 'jug', 'leaf', 'flower', 'lid', 'mirror', 'top', 'window', 'ceramic', 'container', 'outlet', 'picture', 'wood', 'jar', 'cabinet', 'book', 'tree', 'item', 'bottom', 'door', 'toilet', 'stem', 'white', 'light', 'paper', 'fruit', 'counter', 'coffee', 'tea', 'sink', 'mug']
2022-03-17 10:03:20,278.278 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'wall', 'paper', 'plant', 'coffee', 'bowl', 'bird', 'handle', 'leaf', 'item', 'pot', 'shelf', 'lid', 'bowls', 'jar', 'vase', 'jug']
2022-03-17 10:05:43,614.614 2829:trainer.py:487 do_train_dict(): eta: 3:08:05 iter: 60000 speed: 269.7 images/sec total_norm: 148.3866 (150.4786) loss: 138.7343 (141.0275) masked_loss: 1.3430 (1.4124) tag_loss: 137.4193 (139.6151) time: 1.4305 (1.8981) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4253 (1.8930) save_time: 8.8421 (14.7395) lr: 0.000010 max mem: 26307
2022-03-17 10:05:43,617.617 2829:checkpoint.py:222 save(): Saving checkpoint to output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0060000.pt
2022-03-17 10:05:53,024.024 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6060606241226196
2022-03-17 10:05:53,024.024 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 143.82579040527344
2022-03-17 10:05:53,024.024 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 71.76604342976347
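At iter 60000 checkpoint.py writes snapshot/model_iter_0060000.pt under the run's output directory. A minimal sketch of iteration-stamped checkpointing with that naming scheme; only the path pattern comes from the log, while the helper name and saved fields are assumptions:

```python
import os
import torch

def save_snapshot(model, optimizer, iteration, output_dir):
    """Write e.g. <output_dir>/snapshot/model_iter_0060000.pt (sketch,
    not the pipeline's actual checkpoint.py)."""
    snapshot_dir = os.path.join(output_dir, "snapshot")
    os.makedirs(snapshot_dir, exist_ok=True)
    # 7-digit zero padding reproduces the model_iter_0060000.pt name above.
    path = os.path.join(snapshot_dir, f"model_iter_{iteration:07d}.pt")
    torch.save({"iteration": iteration,
                "model": model.state_dict(),
                "optimizer": optimizer.state_dict()}, path)
    return path
```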
2022-03-17 10:06:23,003.003 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02360193431377411
2022-03-17 10:06:23,003.003 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 10:06:23,004.004 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'holding', 'up', 'a', '[MASK]', 'phone', 'at', 'a', 'desk', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 10:06:23,019.019 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['screen', 'wall', 'tablet', 'laptop', 'thumb', 'hand', 'keyboard', 'finger', 'phone', 'picture', 'computer', 'button', 'person', 'key', 'nail', 'desk', 'cover', '[UNK]', 'cord', 'icon', 'letter', 'logo', 'cell', 'train', 'game', 'wire', 'speaker', 'case', 'monitor', 'device', 'frame', 'text', 'man', 'building', 'remote', 'front', 'box', 'ring', 'floor', 'table', 'ship', 'iphone', 'black', 'wheel', 'smart', 'palm', 'container', 'open', 'paper', 'writing']
2022-03-17 10:06:38,857.857 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'person', 'table', 'wall', 'phone', 'key', 'paper', 'computer', 'cell', 'picture', 'screen', 'finger', 'desk', 'button', 'remote', 'thumb', 'monitor', 'logo', 'keyboard', 'laptop', 'tablet']
2022-03-17 10:09:01,521.521 2829:trainer.py:487 do_train_dict(): eta: 3:05:11 iter: 60100 speed: 258.7 images/sec total_norm: 149.0059 (152.4156) loss: 136.8327 (137.3492) masked_loss: 1.3748 (1.3992) tag_loss: 135.5767 (135.9500) time: 1.4303 (1.9791) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4252 (1.8836) save_time: 8.8421 (14.2643) lr: 0.000010 max mem: 26307
2022-03-17 10:09:01,882.882 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361
2022-03-17 10:09:01,883.883 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 136.20635986328125
2022-03-17 10:09:01,883.883 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 71.77072749977492
2022-03-17 10:09:31,935.935 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023635121062397957
2022-03-17 10:09:31,936.936 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 10:09:31,936.936 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'line', 'of', 'two', '[MASK]', 'parked', 'on', 'the', 'side', 'of', '[MASK]', 'road', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 10:09:31,952.952 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['road', 'bus', 'street', 'sidewalk', 'window', 'pole', 'tire', 'line', 'building', '[UNK]', 'person', 'windshield', 'sign', 'post', 'tree', 'door', 'wheel', 'light', 'plate', 'sky', 'front', 'fence', 'man', 'jacket', 'car', 'license', 'curb', 'woman', 'shirt', 'mirror', 'bag', 'city', 'wall', 'lamp', 'roof', 'jean', 'coat', 'hair', 'bench', 'stop', 'number', 'bush', 'railing', 'yellow', 'can', 'driver', 'logo', 'bike', 'trash', 'motorcycle']
2022-03-17 10:09:47,783.783 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'city', 'man', 'side', 'line', 'building', 'road', 'street', 'light', 'post', 'person', 'window', 'sign', 'bus', 'traffic', 'flag', 'wheel', 'panel', 'pole', 'banner', 'lamp', 'balcony', 'sidewalk', 'tire', 'windshield']
2022-03-17 10:12:11,491.491 2829:trainer.py:487 do_train_dict(): eta: 3:02:17 iter: 60200 speed: 269.5 images/sec total_norm: 148.8860 (151.3608) loss: 138.3056 (138.7536) masked_loss: 1.3577 (1.3690) tag_loss: 137.4783 (137.3847) time: 1.4317 (1.8997) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4265 (1.8945) save_time: 8.8421 (14.2643) lr: 0.000009 max mem: 26307
2022-03-17 10:12:11,850.850 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6756756901741028
2022-03-17 10:12:11,851.851 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 148.8243408203125
2022-03-17 10:12:11,851.851 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 71.77542744979732
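The displayed lr drops by 1e-6 roughly every 650-700 iterations (0.000011 at iter 59000 down to 0.000007 at iter 61500). That slope is what a linear decay of the run's 1e-4 peak (per the output-directory name) to zero at about iteration 66,000 would display after rounding; max_iter = 66000 is an inference from this log, not a value it states, and any warmup phase is ignored here:

```python
def linear_decay_lr(iteration, base_lr=1e-4, max_iter=66000):
    """Linearly decay base_lr to zero at max_iter (inferred, see above)."""
    return base_lr * max(0.0, (max_iter - iteration) / max_iter)

for it in (59000, 60200, 61500):
    print(it, round(linear_decay_lr(it), 6))
# 59000 1.1e-05, 60200 9e-06, 61500 7e-06 -- matching the lr column above
```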
2022-03-17 10:12:42,016.016 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023659884929656982
2022-03-17 10:12:42,017.017 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 10:12:42,017.017 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'sitting', 'at', 'a', 'kitchen', 'table', 'opens', 'wide', 'to', 'take', 'the', 'first', 'bite', '[MASK]', 'a', 'sandwich', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 10:12:42,032.032 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['candle', 'table', 'hair', 'plate', 'bottle', 'man', 'shirt', 'window', 'hand', 'sweater', 'flame', 'restaurant', '[UNK]', 'mouth', 'sandwich', 'light', 'wall', 'fork', 'water', 'food', 'glass', 'napkin', 'holder', 'label', 'bowl', 'woman', 'head', 'chair', 'bread', 'hot', 'person', 'liquid', 'knife', 'nose', 'face', 'tattoo', 'ring', 'ceiling', 'cake', 'straw', 'bun', 'arm', 'dog', 'salt', 'spoon', 'container', 'flower', 'room', 'front', 'boy']
2022-03-17 10:12:58,049.049 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'first', 'man', 'hand', 'water', 'light', 'hair', 'mouth', 'person', 'table', 'food', 'wide', 'hot', 'window', 'shirt', 'kitchen', 'dog', 'teeth', 'restaurant', 'plate', 'cabinet', 'bottle', 'bite', 'bread', 'flame', 'holder', 'fork', 'lighter', 'towel', 'sandwich', 'tattoo', 'candle', 'sweater', 'napkin']
03-17 10:13:28.973 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi
03-17 10:13:28.973 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi
03-17 10:13:30.235 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}]
2022-03-17 10:15:21,591.591 2829:trainer.py:487 do_train_dict(): eta: 2:59:22 iter: 60300 speed: 269.3 images/sec total_norm: 149.8289 (150.9457) loss: 136.7347 (137.5155) masked_loss: 1.4130 (1.4382) tag_loss: 135.4882 (136.0772) time: 1.4320 (1.9010) data: 0.0001 (0.0002) to_device: 0.0050 (0.0050) time_gpu: 1.4267 (1.8959) save_time: 8.8421 (14.2643) lr: 0.000009 max mem: 26307
2022-03-17 10:15:21,951.951 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7428571581840515
2022-03-17 10:15:21,951.951 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 117.19361877441406
2022-03-17 10:15:21,952.952 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.77935245968648 2022-03-17 10:15:51,924.924 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023659594357013702 2022-03-17 10:15:51,925.925 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 10:15:51,925.925 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'dog', 'that', '[MASK]', 'herd', '##ing', 'two', 'sheep', 'while', 'people', 'watch', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 10:15:51,941.941 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['person', 'sheep', 'grass', 'dog', 'chair', '[UNK]', 'hat', 'shirt', 'man', 'jacket', 'woman', 'fence', 'field', 'pole', 'head', 'leg', 'bag', 'jean', 'shoe', 'flag', 'umbrella', 'hair', 'tree', 'stand', 'face', 'wool', 'ground', 'boy', 'sign', 'child', 'tail', 'girl', 'spectator', 'line', 'cap', 'table', 'lamb', 'dirt', 'group', 'coat', 'horn', 'hand', 'helmet', 'animal', 'wall', 'post', 'goat', 'sky', 'net', 'bat'] 2022-03-17 10:16:07,853.853 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'line', 'woman', 'field', 'hair', 'person', 'table', 'boy', 'chair', 'jean', 'shirt', 'dog', 'animal', 'leg', 'grass', 'hat', 'jacket', 'bench', 'dirt', 'sheep', 'fence'] 2022-03-17 10:18:31,835.835 2829:trainer.py:487 do_train_dict(): eta: 2:56:28 iter: 60400 speed: 269.1 images/sec total_norm: 148.0683 (151.8012) loss: 140.3271 (140.4253) masked_loss: 1.3901 (1.4215) tag_loss: 138.8503 (139.0038) time: 1.4316 (1.9024) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4263 (1.8973) save_time: 8.8421 (14.2643) lr: 0.000009 max mem: 26307 2022-03-17 10:18:32,196.196 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361 2022-03-17 10:18:32,196.196 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 117.75284576416016 2022-03-17 10:18:32,196.196 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.78705593771186 2022-03-17 10:19:02,443.443 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023665286600589752 2022-03-17 10:19:02,443.443 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 10:19:02,443.443 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'picture', 'of', '[MASK]', 'cows', 'eating', 'grass', '[MASK]', 'a', 'field', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 10:19:02,459.459 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['grass', 'cow', 'tree', 'head', 'field', 'fence', 'leg', 'sky', 'tail', 'bush', 'face', 'plant', 'cloud', 'post', 'building', 'pasture', 'ear', 'green', 'grassy', 'ground', 'pole', 'flower', 'background', 'roof', '[UNK]', 'cattle', 'hill', 'house', 'leaf', 'grazing', 'weed', 'mountain', 'brown', 'white', 'herd', 'lush', 'top', 'animal', 'group', 'trunk', 'horse', 'bird', 'branch', 'sheep', 'barn', 'area', 'open', 'rock', 'horn', 'neck'] 2022-03-17 10:19:18,392.392 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'field', 'post', 'tree', 'sky', 'picture', 'leg', 'grass', 'bush', 'cloud', 'fence', 'cow', 'weed'] 2022-03-17 10:21:42,109.109 2829:trainer.py:487 do_train_dict(): eta: 2:53:33 iter: 60500 speed: 269.1 images/sec total_norm: 147.7416 (151.3848) loss: 138.9736 (142.1665) masked_loss: 1.4013 (1.4407) tag_loss: 137.5656 (140.7258) time: 1.4303 (1.9028) data: 0.0001 (0.0001) to_device: 0.0052 (0.0051) time_gpu: 1.4251 (1.8975) save_time: 8.8421 (14.2643) lr: 0.000009 max mem: 26307 2022-03-17 10:21:42,470.470 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.46875 2022-03-17 10:21:42,470.470 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 160.2852783203125 2022-03-17 10:21:42,471.471 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.77744458768234 2022-03-17 10:22:12,979.979 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023684196174144745 2022-03-17 10:22:12,979.979 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 10:22:12,979.979 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'computer', 'keyboards', 'are', 'displayed', 'on', '[MASK]', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 10:22:12,995.995 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['desk', 'keyboard', 'table', 'key', 'mouse', 'cord', 'paper', 'wire', 'button', 'logo', 'computer', '[UNK]', 'wall', 'book', 'phone', 'plug', 'wooden', 'handle', 'base', 'black', 'pad', 'next', 'top', 'pen', 'telephone', 'cup', 'cable', 'ipod', 'box', 'speaker', 'mug', 'object', 'outlet', 'knob', 'writing', 'circle', 'light', 'monitor', 'white', 'camera', 'small', 'letter', 'coffee', 'floor', 'cap', 'container', 'board', 'other', 'bag', 'stand'] 2022-03-17 10:22:28,894.894 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['book', 'table', 'phone', 'key', 'paper', 'computer', 'cell', 'desk', 'button', 'wire', 'mouse', 'logo', 'keyboard', 'cord'] 2022-03-17 10:24:52,616.616 2829:trainer.py:487 do_train_dict(): eta: 2:50:38 iter: 60600 speed: 268.8 images/sec total_norm: 147.0846 (149.8204) loss: 137.7735 (137.5887) masked_loss: 1.3666 (1.3823) tag_loss: 135.8464 (136.2064) time: 1.4323 (1.9050) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4270 (1.8998) save_time: 8.8421 (14.2643) lr: 0.000009 max mem: 26307 2022-03-17 10:24:52,977.977 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.625 2022-03-17 10:24:52,977.977 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 134.27268981933594 2022-03-17 10:24:52,977.977 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.78060899888467 2022-03-17 10:25:23,499.499 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02371017448604107 2022-03-17 10:25:23,499.499 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 10:25:23,500.500 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'group', 'of', 'kids', 'in', 'a', '[MASK]', 'plays', 'with', 'a', '[MASK]', 'outside', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 10:25:23,515.515 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['grass', 'person', 'shirt', 'tree', 'man', 'ground', 'field', '[UNK]', 'short', 'leg', 'hat', 'head', 'hair', 'cap', 'shadow', 'arm', 'group', 'shoe', 'woman', 'hand', 'girl', 'park', 'background', 'dog', 'bag', 'boy', 'jersey', 'pole', 'bat', 'bird', 'ball', 'kite', 'baseball', 'child', 'sock', 'player', 'uniform', 'grassy', 'green', 'can', 'jean', 'game', 'crowd', 'jacket', 'bush', 'tail', 'trash', 'cone', 'soccer', 'flag'] 2022-03-17 10:25:39,405.405 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'group', 'air', 'park', 'field', 'ground', 'hair', 'girl', 'person', 'child', 'boy', 'couple', 'tree', 'sky', 'jean', 'shirt', 'dress', 'string', 'shadow', 'grass', 'cloud', 'trunk', 'kit', 'skirt', 'kite'] 2022-03-17 10:28:03,002.002 2829:trainer.py:487 do_train_dict(): eta: 2:47:44 iter: 60700 speed: 268.9 images/sec total_norm: 149.5002 (151.8643) loss: 136.4660 (136.8315) masked_loss: 1.3754 (1.4175) tag_loss: 135.2537 (135.4139) time: 1.4309 (1.9039) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4258 (1.8988) save_time: 8.8421 (14.2643) lr: 0.000009 max mem: 26307 2022-03-17 10:28:03,362.362 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6000000238418579 2022-03-17 10:28:03,362.362 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 122.81192016601562 2022-03-17 10:28:03,363.363 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.7867005687011 2022-03-17 10:28:33,927.927 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02372957579791546 2022-03-17 10:28:33,927.927 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 10:28:33,927.927 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'flock', 'of', 'sheep', 'climbing', 'up', '[MASK]', 'crest', 'of', '[MASK]', 'hill', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 10:28:33,943.943 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'sky', 'sheep', 'grass', 'head', 'leg', 'field', 'mountain', 'hill', 'ear', 'face', 'herd', 'background', 'group', 'bush', 'ground', 'wool', '[UNK]', 'tail', 'cow', 'grassy', 'grazing', 'bucket', 'open', 'lamb', 'trough', 'house', 'animal', 'container', 'building', 'green', 'standing', 'next', 'spot', 'rock', 'distance', 'pasture', 'stand', 'top', 'large', 'other', 'flock', 'hay', 'middle', 'nose', 'bunch', 'white', 'couple', 'area', 'food'] 2022-03-17 10:28:49,908.908 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'group', 'face', 'field', 'hill', 'mountain', 'metal', 'tree', 'sky', 'leg', 'ear', 'grass', 'bush', 'sheep', 'crest', 'herd', 'flock'] 2022-03-17 10:31:13,531.531 2829:trainer.py:487 do_train_dict(): eta: 2:44:49 iter: 60800 speed: 268.7 images/sec total_norm: 149.1399 (151.9852) loss: 136.9570 (137.5027) masked_loss: 1.3578 (1.3858) tag_loss: 135.9295 (136.1169) time: 1.4315 (1.9053) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4262 (1.9002) save_time: 8.8421 (14.2643) lr: 0.000008 max mem: 26307 2022-03-17 10:31:13,892.892 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091 2022-03-17 10:31:13,892.892 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 154.19613647460938 2022-03-17 10:31:13,892.892 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.78898504723861
2022-03-17 10:31:44,388.388 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023804496973752975
2022-03-17 10:31:44,388.388 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 10:31:44,388.388 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'riding', 'a', 'speed', 'boat', 'on', '##wani', 'of', 'a', 'body', 'of', 'water', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 10:31:44,404.404 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['water', 'wall', 'boat', 'man', 'snow', 'person', 'plant', 'stripe', 'tree', 'fence', 'shirt', '[UNK]', 'jacket', 'head', 'bush', 'hair', 'ski', 'building', 'wave', 'hat', 'glove', 'rope', 'umbrella', 'woman', 'bottom', 'front', 'writing', 'hand', 'motor', 'pot', 'step', 'flag', 'top', 'car', 'handle', 'stair', 'windshield', 'vest', 'railing', 'number', 'small', 'ground', 'helmet', 'hood', 'coat', 'light', 'pole', 'dock', 'logo', 'window']
2022-03-17 10:32:00,322.322 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'water', 'body', 'top', 'front', 'person', 'wall', 'speed', 'plant', 'tree', 'border', 'bottom', 'boat', 'snow', 'jacket', 'hood', 'helmet', 'glove', 'stripe']
2022-03-17 10:34:24,122.122 2829:trainer.py:487 do_train_dict(): eta: 2:41:54 iter: 60900 speed: 268.6 images/sec total_norm: 149.0155 (151.4459) loss: 140.3364 (140.3380) masked_loss: 1.4813 (1.4576) tag_loss: 139.0472 (138.8804) time: 1.4318 (1.9059) data: 0.0001 (0.0005) to_device: 0.0051 (0.0050) time_gpu: 1.4267 (1.9004) save_time: 8.8421 (14.2643) lr: 0.000008 max mem: 26307
2022-03-17 10:34:24,483.483 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091
2022-03-17 10:34:24,483.483 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 143.6436767578125
2022-03-17 10:34:24,484.484 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 71.79033829110568
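The eta field is consistent with plain arithmetic: the smoothed per-iteration time multiplied by the iterations remaining. Reading the iter-60900 record above backwards, 2:41:54 = 9714 s at 1.9059 s/iter implies about 5,100 iterations left, again pointing at an endpoint near iteration 66,000 (an inference, as in the lr note earlier):

```python
import datetime

def eta(iteration, avg_iter_time_s, max_iter=66000):
    """ETA = smoothed seconds/iter x iterations remaining (max_iter inferred)."""
    remaining_s = (max_iter - iteration) * avg_iter_time_s
    return str(datetime.timedelta(seconds=int(remaining_s)))

print(eta(60900, 1.9059))  # 2:42:00, within seconds of "eta: 2:41:54" above
```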
2022-03-17 10:34:54,872.872 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0238014105707407
2022-03-17 10:34:54,873.873 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 10:34:54,874.874 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'horse', 'kicking', 'another', '[MASK]', 'in', 'the', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 10:34:54,889.889 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['leg', 'grass', 'fence', 'horse', 'tail', '[UNK]', 'head', 'mane', 'zebra', 'wood', 'wall', 'ground', 'log', 'animal', 'person', 'ear', 'short', 'pole', 'building', 'shirt', 'harness', 'man', 'field', 'hair', 'leaf', 'rope', 'gate', 'nose', 'shoe', 'eye', 'mouth', 'dog', 'jean', 'neck', 'window', 'rock', 'tree', 'jacket', 'bag', 'stick', 'enclosure', 'board', 'hand', 'wooden', 'roof', 'goat', 'legs', 'green', 'box', 'next']
2022-03-17 10:35:10,822.822 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'building', 'field', 'ground', 'hair', 'wall', 'goal', 'neck', 'horse', 'shirt', 'leg', 'roof', 'ear', 'grass', 'tail', 'pole', 'fence', 'log', 'shoe', 'harness', 'vest', 'mane', 'zebra']
2022-03-17 10:37:34,803.803 2829:trainer.py:487 do_train_dict(): eta: 2:38:59 iter: 61000 speed: 268.5 images/sec total_norm: 149.0821 (151.2864) loss: 140.4911 (140.9709) masked_loss: 1.3454 (1.4337) tag_loss: 139.1099 (139.5371) time: 1.4306 (1.9067) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4256 (1.9016) save_time: 8.8421 (14.2643) lr: 0.000008 max mem: 26307
2022-03-17 10:37:35,164.164 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6969696879386902
2022-03-17 10:37:35,164.164 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 157.09637451171875
2022-03-17 10:37:35,164.164 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.79372715052529 2022-03-17 10:38:05,830.830 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023831909522414207 2022-03-17 10:38:05,830.830 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 10:38:05,830.830 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'mean', 'are', 'standing', 'in', 'the', 'open', 'door', '[unused53]', 'a', 'city', 'bus', '.', '##ei', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 10:38:05,845.845 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['window', 'head', 'train', 'roof', 'shirt', 'man', 'sky', 'door', 'hair', 'tree', 'bus', '[UNK]', 'stripe', 'woman', 'hand', 'face', 'car', 'person', 'number', 'letter', 'hat', 'sign', 'arm', 'top', 'skirt', 'dress', 'light', 'logo', 'jean', 'handle', 'track', 'building', 'mirror', 'leg', 'shadow', 'curtain', 'tire', 'pole', 'boy', 'seat', 'lady', 'bag', 'wheel', 'passenger', 'street', 'ground', 'jacket', 'scarf', 'cloud', 'shoe'] 2022-03-17 10:38:21,741.741 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['city', 'head', 'man', 'hand', 'open', 'door', 'car', 'hair', 'person', 'boy', 'window', 'train', 'bar', 'tree', 'watch', 'box', 'letter', 'sign', 'sky', 'jean', 'shirt', 'bus', 'roof', 'passenger', 'billboard', 'handle', 'hat', 'pole', 'hood', 'logo', 'cab', 'taxi', 'trolley'] 2022-03-17 10:40:45,330.330 2829:trainer.py:487 do_train_dict(): eta: 2:36:04 iter: 61100 speed: 268.7 images/sec total_norm: 147.3039 (150.5710) loss: 136.7030 (137.7486) masked_loss: 1.3435 (1.3712) tag_loss: 135.2228 (136.3774) time: 1.4314 (1.9053) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4261 (1.9001) save_time: 8.8421 (14.2643) lr: 0.000008 max mem: 26307 2022-03-17 10:40:45,691.691 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663 2022-03-17 10:40:45,691.691 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 132.9183349609375 2022-03-17 10:40:45,691.691 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.7994185117335 2022-03-17 10:41:16,084.084 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023850420489907265 2022-03-17 10:41:16,085.085 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 10:41:16,086.086 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'large', '[MASK]', 'polar', '[MASK]', 'standing', 'on', 'a', 'icy', 'pool', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 10:41:16,102.102 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['bear', 'ear', 'water', 'polar', 'head', 'nose', 'rock', 'leg', 'eye', 'mouth', 'snow', 'ground', 'paw', 'pool', 'shadow', 'face', 'claw', 'ice', 'fur', 'teeth', 'ball', 'wall', 'zoo', 'boulder', 'white', 'large', 'sand', 'object', 'stone', 'swimming', '[UNK]', 'tail', 'splash', 'bubble', 'background', 'big', 'grass', 'top', 'tongue', 'wave', 'next', 'toy', 'neck', 'standing', 'structure', 'tree', 'ledge', 'snout', 'foam', 'exhibit'] 2022-03-17 10:41:32,029.029 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'face', 'water', 'large', 'white', 'ground', 'rock', 'mouth', 'eye', 'leg', 'nose', 'ear', 'bear', 'snow', 'pool', 'handle', 'polar', 'sidewalk', 'cone', 'boulder', 'icy', 'curb', 'weed'] 03-17 10:43:30.336 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-17 10:43:30.336 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-17 10:43:31.430 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 95}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 97}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 99}] 2022-03-17 10:43:55,801.801 2829:trainer.py:487 do_train_dict(): eta: 2:33:09 iter: 61200 speed: 268.8 images/sec total_norm: 148.4808 (151.6336) loss: 137.6155 (138.9650) masked_loss: 1.3438 (1.3520) tag_loss: 136.1368 (137.6131) time: 1.4304 (1.9048) data: 0.0001 (0.0002) to_device: 0.0052 (0.0050) time_gpu: 1.4252 (1.8996) save_time: 8.8421 (14.2643) lr: 0.000008 max mem: 26307 2022-03-17 10:43:56,161.161 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6666666865348816 2022-03-17 10:43:56,161.161 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 114.75300598144531 2022-03-17 10:43:56,161.161 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.80511071981458 2022-03-17 10:44:26,725.725 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023850349709391594 2022-03-17 10:44:26,726.726 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 10:44:26,726.726 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'white', 'toilet', 'with', 'a', 'wooden', 'lid', '[MASK]', 'toilet', 'paper', 'sitting', '[MASK]', 'top', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 10:44:26,741.741 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['toilet', 'wall', 'floor', 'paper', 'bathroom', 'roll', 'seat', 'pipe', 'tile', 'lid', 'bowl', 'holder', 'hole', 'handle', 'brush', '[UNK]', 'tank', 'line', 'carpet', 'water', 'bottle', 'wooden', 'ground', 'bar', 'wood', 'small', 'container', 'can', 'label', 'next', 'tape', 'trash', 'door', 'tube', 'white', 'metal', 'top', 'tissue', 'sink', 'towel', 'cap', 'hose', 'restroom', 'close', 'knob', 'open', 'bucket', 'object', 'ring', 'soap'] 2022-03-17 10:44:42,718.718 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'white', 'top', 'floor', 'wall', 'seat', 'paper', 'label', 'bowl', 'wooden', 'handle', 'bathroom', 'bottle', 'brush', 'pipe', 'carpet', 'container', 'toilet', 'lid', 'tile'] 2022-03-17 10:47:06,158.158 2829:trainer.py:487 do_train_dict(): eta: 2:30:14 iter: 61300 speed: 269.0 images/sec total_norm: 146.8184 (148.2614) loss: 133.5750 (135.4241) masked_loss: 1.4249 (1.4418) tag_loss: 132.1449 (133.9823) time: 1.4315 (1.9035) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4265 (1.8984) save_time: 8.8421 (14.2643) lr: 0.000008 max mem: 26307 2022-03-17 10:47:06,518.518 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5625 2022-03-17 10:47:06,519.519 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 139.29115295410156 2022-03-17 10:47:06,519.519 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.80900373055026 2022-03-17 10:47:37,478.478 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023856952786445618 2022-03-17 10:47:37,479.479 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 10:47:37,479.479 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'bear', 'is', '[MASK]', 'behind', 'a', 'chain', '[MASK]', 'fence', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 10:47:37,494.494 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['fence', 'bear', 'leaf', 'tree', 'ground', 'ear', 'head', 'pole', 'rock', 'plant', 'trunk', 'animal', 'bush', 'black', 'face', 'leg', 'cow', 'eye', '[UNK]', 'snout', 'large', 'mouth', 'log', 'wire', 'nose', 'area', 'branch', 'brown', 'zoo', 'grass', 'post', 'forest', 'weed', 'tag', 'tongue', 'next', 'enclosure', 'car', 'link', 'big', 'walking', 'neck', 'small', 'building', 'top', 'standing', 'window', 'field', 'baby', 'collar'] 2022-03-17 10:47:53,388.388 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'face', 'ground', 'mouth', 'plant', 'tree', 'tongue', 'nose', 'bear', 'chain', 'link', 'grass', 'leaf', 'trunk', 'fence', 'cow'] 2022-03-17 10:50:16,779.779 2829:trainer.py:487 do_train_dict(): eta: 2:27:19 iter: 61400 speed: 268.6 images/sec total_norm: 147.6159 (149.4563) loss: 139.9171 (139.4900) masked_loss: 1.3465 (1.3837) tag_loss: 138.0495 (138.1063) time: 1.4322 (1.9062) data: 0.0001 (0.0001) to_device: 0.0052 (0.0051) time_gpu: 1.4271 (1.9010) save_time: 8.8421 (14.2643) lr: 0.000008 max mem: 26307 2022-03-17 10:50:17,141.141 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6060606241226196 2022-03-17 10:50:17,141.141 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 131.79714965820312 2022-03-17 10:50:17,141.141 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
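The `eta:` field shrinks by roughly 2:55 per 100 iterations at ~1.9 s/iter, which fits the usual estimate of remaining iterations times the averaged iteration time. A sketch follows; the total iteration count is not visible in this excerpt, so max_iter here is an assumption chosen to roughly fit the logged values:

```python
# Sketch of the eta computation implied by the log; max_iter is assumed.
import datetime

def eta_string(avg_iter_time_s, cur_iter, max_iter):
    remaining_s = avg_iter_time_s * (max_iter - cur_iter)
    return str(datetime.timedelta(seconds=int(remaining_s)))

# eta_string(1.9035, 61300, 66000) -> '2:29:06', near the logged 2:30:14
```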
= 71.81017521183665 2022-03-17 10:50:48,056.056 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023853259161114693 2022-03-17 10:50:48,056.056 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 10:50:48,056.056 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'dead', '[MASK]', '[MASK]', 'bears', 'on', 'display', 'at', 'a', 'museum', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 10:50:48,072.072 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'bear', 'ear', 'head', 'sky', 'cloud', 'leg', 'wall', 'paw', 'rock', 'nose', 'ground', 'log', 'trunk', 'wood', 'eye', 'mountain', 'zoo', 'face', 'plant', 'brown', 'branch', 'hole', 'large', 'bush', 'stump', 'grass', 'shadow', 'enclosure', 'stick', 'foot', 'claw', 'bark', 'cliff', 'leaf', 'top', 'forest', 'mouth', 'hill', 'next', 'small', 'floor', 'stone', 'formation', 'arm', 'dirt', 'tail', 'museum', 'animal', 'brick'] 2022-03-17 10:51:04,012.012 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'ground', 'rock', 'museum', 'dead', 'brown', 'eye', 'tree', 'wood', 'sky', 'leg', 'nose', 'ear', 'bear', 'display', 'cloud', 'log', 'paw'] 2022-03-17 10:53:27,482.482 2829:trainer.py:487 do_train_dict(): eta: 2:24:24 iter: 61500 speed: 268.5 images/sec total_norm: 147.8613 (150.6930) loss: 135.0694 (136.9971) masked_loss: 1.3657 (1.4089) tag_loss: 133.6291 (135.5882) time: 1.4313 (1.9071) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4261 (1.9018) save_time: 8.8421 (14.2643) lr: 0.000007 max mem: 26307 2022-03-17 10:53:27,845.845 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6363636255264282 2022-03-17 10:53:27,846.846 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 104.88545227050781 2022-03-17 10:53:27,846.846 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
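In every do_train_dict() record, `loss` tracks `masked_loss + tag_loss` closely (at iter 61500: 1.3657 + 133.6291 = 134.9948 against a reported 135.0694), so the objective appears to sum a masked-language-model caption loss and a multi-label tag loss; the small residual could be another term or a different reduction. A sketch of that composition, with reductions and weighting as assumptions:

```python
# Sketch of the loss composition implied by the logged fields; the exact
# reductions/weights are assumptions and may not match the pipeline's.
import torch.nn.functional as F

def combined_loss(mlm_logits, mlm_labels, tag_logits, tag_targets):
    # masked_loss: cross-entropy over masked caption positions only;
    # non-scored positions carry the PyTorch ignore label -100
    masked_loss = F.cross_entropy(
        mlm_logits.reshape(-1, mlm_logits.size(-1)),
        mlm_labels.reshape(-1),
        ignore_index=-100,
    )
    # tag_loss: multi-label objective over the tag vocabulary
    # (tag_targets is a float 0/1 matrix; summed, then averaged per sample)
    tag_loss = F.binary_cross_entropy_with_logits(
        tag_logits, tag_targets, reduction="sum"
    ) / tag_logits.size(0)
    return masked_loss + tag_loss, masked_loss, tag_loss
```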
= 71.81883064492956 2022-03-17 10:53:58,773.773 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023857463151216507 2022-03-17 10:53:58,773.773 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 10:53:58,773.773 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'sleeps', 'on', 'techniques', 'red', 'carpet', 'with', 'tennis', 'shoes', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 10:53:58,789.789 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['cat', 'head', 'ear', '[UNK]', 'shoe', 'carpet', 'fur', 'nose', 'paw', 'floor', 'wall', 'leg', 'red', 'white', 'cord', 'face', 'couch', 'eye', 'back', 'tail', 'body', 'person', 'rug', 'hand', 'blanket', 'spot', 'top', 'next', 'toy', 'gray', 'wire', 'string', 'pair', 'black', 'blue', 'grey', 'stripe', 'kitten', 'pink', 'sock', 'small', 'logo', 'foot', 'sleeping', 'light', 'close', 'chair', 'bag', 'playing', 'ball'] 2022-03-17 10:54:14,765.765 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'red', 'wall', 'nose', 'ear', 'cat', 'tennis', 'fur', 'carpet', 'shoe', 'paw', 'sleeps'] 2022-03-17 10:56:38,056.056 2829:trainer.py:487 do_train_dict(): eta: 2:21:29 iter: 61600 speed: 268.7 images/sec total_norm: 147.9398 (152.3039) loss: 140.0132 (140.6528) masked_loss: 1.3359 (1.3725) tag_loss: 138.6502 (139.2804) time: 1.4314 (1.9057) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4261 (1.9006) save_time: 8.8421 (14.2643) lr: 0.000007 max mem: 26307 2022-03-17 10:56:38,417.417 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6176470518112183 2022-03-17 10:56:38,418.418 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 156.3429718017578 2022-03-17 10:56:38,418.418 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
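The `Input ids sample` lines show BERT-style input corruption: most selected positions become '[MASK]', and a few are swapped for random vocabulary tokens, which is what produces oddities like 'techniques' in the "a [MASK] sleeps on techniques red carpet" sample above. A sketch assuming the standard 15% / 80-10-10 BERT recipe (the actual ratios are not in the log):

```python
# Sketch of BERT-style corruption consistent with the logged samples;
# the 15% selection rate and 80/10/10 split are the standard BERT recipe,
# assumed here rather than read from the pipeline.
import torch

def corrupt_caption(input_ids, mask_token_id, vocab_size, special_mask, p=0.15):
    labels = input_ids.clone()
    probs = torch.full(input_ids.shape, p)
    probs.masked_fill_(special_mask, 0.0)      # never corrupt [CLS]/[SEP]/[PAD]
    selected = torch.bernoulli(probs).bool()
    labels[~selected] = -100                   # score only corrupted positions

    masked = torch.bernoulli(torch.full(input_ids.shape, 0.8)).bool() & selected
    input_ids[masked] = mask_token_id          # 80% -> [MASK]

    rand = (torch.bernoulli(torch.full(input_ids.shape, 0.5)).bool()
            & selected & ~masked)              # 10% -> random token
    input_ids[rand] = torch.randint(vocab_size, input_ids.shape)[rand]
    return input_ids, labels                   # remaining 10% kept as-is
```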
= 71.81930265627572 2022-03-17 10:57:09,554.554 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023935874924063683 2022-03-17 10:57:09,555.555 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 10:57:09,555.555 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'in', 'the', '[MASK]', ',', 'a', 'lady', 'getting', 'something', 'from', 'the', '[MASK]', 'while', 'a', 'man', 'is', 'putting', 'something', 'in', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 10:57:09,570.570 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'kitchen', 'hair', '[UNK]', 'cabinet', 'glasses', 'refrigerator', 'man', 'woman', 'bottle', 'window', 'hand', 'door', 'handle', 'bowl', 'food', 'head', 'floor', 'pot', 'wall', 'drawer', 'pan', 'table', 'arm', 'person', 'jean', 'towel', 'can', 'paper', 'box', 'face', 'bag', 'knife', 'napkin', 'apron', 'lid', 'lady', 'stove', 'girl', 'container', 'ear', 'short', 'jug', 'board', 'plate', 'light', 'cup', 'ceiling', 'shoe', 'shelf'] 2022-03-17 10:57:25,617.617 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'can', 'man', 'face', 'something', 'door', 'light', 'woman', 'hair', 'floor', 'table', 'wall', 'food', 'lady', 'window', 'box', 'jean', 'shirt', 'kitchen', 'dress', 'bowl', 'handle', 'cabinet', 'bottle', 'ceiling', 'pan', 'glasses', 'cloth', 'pot', 'towel', 'trash', 'lid', 'stove', 'oven', 'refrigerator', 'jug'] 2022-03-17 10:59:48,861.861 2829:trainer.py:487 do_train_dict(): eta: 2:18:34 iter: 61700 speed: 268.3 images/sec total_norm: 148.6898 (151.4357) loss: 137.3496 (139.4405) masked_loss: 1.3560 (1.3984) tag_loss: 135.9817 (138.0421) time: 1.4312 (1.9080) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4261 (1.9029) save_time: 8.8421 (14.2643) lr: 0.000007 max mem: 26307 2022-03-17 10:59:49,222.222 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.8055555820465088 2022-03-17 10:59:49,223.223 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 170.92286682128906 2022-03-17 10:59:49,223.223 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
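The `caption acc` values are exact small-integer fractions (0.6666… = 2/3, 0.5625 = 9/16, 0.6176… = 21/34), which fits accuracy computed over only the masked caption positions in a batch. A sketch, reusing the -100 ignore-label convention from the masking sketch above:

```python
# Sketch: accuracy over masked positions only; inputs are torch tensors,
# and labels use -100 for positions that are not scored.
def caption_accuracy(mlm_logits, mlm_labels):
    scored = mlm_labels != -100
    preds = mlm_logits.argmax(dim=-1)
    return (preds[scored] == mlm_labels[scored]).float().mean()
```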
= 71.82107164327381 2022-03-17 11:00:20,436.436 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023950770497322083 2022-03-17 11:00:20,436.436 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 11:00:20,436.436 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'there', 'is', 'a', 'dog', 'sitting', '[MASK]', 'the', 'cart', 'of', 'a', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 11:00:20,451.451 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['car', 'building', 'sky', 'window', 'tire', 'wheel', 'sign', 'bike', 'shadow', 'motorcycle', 'light', 'seat', 'door', 'pole', 'handle', 'windshield', 'road', 'street', 'ground', 'line', '[UNK]', 'roof', 'traffic', 'mirror', 'helmet', 'flag', 'mountain', 'truck', 'person', 'logo', 'tree', 'gas', 'man', 'background', 'white', 'cone', 'arrow', 'vehicle', 'shirt', 'side', 'bicycle', 'billboard', 'lot', 'parking', 'parked', 'next', 'fence', 'silver', 'sidewalk', 'fender'] 2022-03-17 11:00:36,412.412 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'line', 'building', 'door', 'road', 'street', 'light', 'car', 'window', 'sign', 'sky', 'dog', 'vehicle', 'handle', 'shadow', 'flag', 'wheel', 'tail', 'pole', 'bike', 'motorcycle', 'helmet', 'cart', 'tire', 'harness', 'windshield', 'paw'] 2022-03-17 11:02:59,749.749 2829:trainer.py:487 do_train_dict(): eta: 2:15:38 iter: 61800 speed: 268.2 images/sec total_norm: 148.8605 (152.0960) loss: 138.7571 (139.6871) masked_loss: 1.4104 (1.4105) tag_loss: 137.4981 (138.2766) time: 1.4327 (1.9089) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4277 (1.9037) save_time: 8.8421 (14.2643) lr: 0.000007 max mem: 26307 2022-03-17 11:03:00,109.109 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6944444179534912 2022-03-17 11:03:00,110.110 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 115.57704162597656 2022-03-17 11:03:00,110.110 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.82539304200205 2022-03-17 11:03:30,800.800 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02395489253103733 2022-03-17 11:03:30,801.801 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 11:03:30,801.801 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'an', 'instructional', 'sign', 'is', 'placed', '[MASK]', 'a', 'fence', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 11:03:30,817.817 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['letter', 'sign', 'pole', 'fence', 'grass', 'ground', 'post', 'sky', 'building', 'tree', 'mountain', 'red', 'cloud', 'stop', 'wire', '[UNK]', 'parking', 'road', 'roof', 'bolt', 'hill', 'bush', 'number', 'arrow', 'car', 'water', 'lot', 'line', 'circle', 'chain', 'sand', 'area', 'lettering', 'dirt', 'screw', 'field', 'flower', 'word', 'wood', 'white', 'window', 'rock', 'close', 'box', 'truck', 'wall', 'paint', 'power', 'side', 'top'] 2022-03-17 11:03:46,737.737 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['building', 'ground', 'post', 'mountain', 'letter', 'sign', 'sky', 'boat', 'roof', 'grass', 'cloud', 'pole', 'dirt', 'wire', 'fence', 'instructional'] 2022-03-17 11:06:10,506.506 2829:trainer.py:487 do_train_dict(): eta: 2:12:43 iter: 61900 speed: 268.4 images/sec total_norm: 149.0126 (153.1112) loss: 137.2758 (138.4624) masked_loss: 1.3879 (1.4212) tag_loss: 135.9042 (137.0412) time: 1.4311 (1.9076) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4261 (1.9025) save_time: 8.8421 (14.2643) lr: 0.000007 max mem: 26307 2022-03-17 11:06:10,872.872 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5428571701049805 2022-03-17 11:06:10,872.872 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 153.95523071289062 2022-03-17 11:06:10,872.872 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
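The running `Tag Precision` hovers near 71.8% and moves only in the third or fourth decimal per step, i.e. a cumulative average over many batches. One plausible reading is precision of the generated tag set against the GT tag set, sketched below; how the pipeline actually thresholds or truncates predictions is not visible in the log:

```python
# Sketch: set-precision of predicted tags vs. ground truth, in percent.
# The choice of prediction set (top-k vs. threshold) is an assumption.
def tag_precision(predicted_tags, gt_tags):
    predicted, gt = set(predicted_tags), set(gt_tags)
    return 100.0 * len(predicted & gt) / max(len(predicted), 1)

# For the iter-61900 sample above, tags like 'sign', 'pole', 'fence', and
# 'grass' appear in both the generated list and the GT list.
```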
= 71.8274375361781 2022-03-17 11:06:41,899.899 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023952605202794075 2022-03-17 11:06:41,900.900 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 11:06:41,900.900 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'couple', 'looking', 'into', 'each', '[MASK]', 'eyes', 'on', '[MASK]', 'bench', 'in', 'a', 'grassy', 'field', '.', '##ize', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 11:06:41,916.916 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hair', 'shirt', 'bench', 'hand', 'tree', 'man', 'leg', 'grass', 'woman', 'head', 'person', '[UNK]', 'girl', 'jean', 'park', 'couple', 'arm', 'short', 'face', 'baseball', 'bird', 'boy', 'watch', 'foot', 'bracelet', 'sweater', 'bush', 'shoe', 'ground', 'photo', 'top', 'seat', 'ball', 'young', 'tank', 'plant', 'back', 'necklace', 'other', 'post', 'group', 'dress', 'pole', 'glasses', 'bat', 'flower', 'wooden', 'nose', 'field', 'cup'] 2022-03-17 11:06:57,854.854 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'park', 'woman', 'field', 'hair', 'girl', 'person', 'seat', 'couple', 'tree', 'baseball', 'jean', 'shirt', 'leg', 'grass', 'bench', 'bracelet', 'grassy'] 2022-03-17 11:09:21,580.580 2829:trainer.py:487 do_train_dict(): eta: 2:09:48 iter: 62000 speed: 268.0 images/sec total_norm: 148.6911 (151.4949) loss: 135.5490 (137.3901) masked_loss: 1.3895 (1.4084) tag_loss: 134.0743 (135.9817) time: 1.4311 (1.9108) data: 0.0001 (0.0005) to_device: 0.0052 (0.0051) time_gpu: 1.4257 (1.9052) save_time: 8.8421 (14.2643) lr: 0.000007 max mem: 26307 2022-03-17 11:09:21,942.942 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5625 2022-03-17 11:09:21,942.942 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 167.65245056152344 2022-03-17 11:09:21,943.943 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.82623415715068 2022-03-17 11:09:52,935.935 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02395820990204811 2022-03-17 11:09:52,936.936 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 11:09:52,936.936 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'black', 'and', 'white', 'cat', '[MASK]', 'behind', 'a', 'screen', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 11:09:52,952.952 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['eye', 'nose', 'head', 'cat', 'mouth', 'face', '[UNK]', 'ear', 'dog', 'leg', 'animal', 'wall', 'white', 'man', 'arm', 'hand', 'person', 'black', 'fence', 'screen', 'hair', 'foot', 'light', 'collar', 'paw', 'picture', 'reflection', 'shadow', 'body', 'neck', 'tongue', 'shirt', 'tie', 'dress', 'something', 'mesh', 'stripe', 'bow', 'cage', 'back', 'image', 'dark', 'photo', 'woman', 'front', 'camera', 'tile', 'tail', 'fur', 'floor'] 2022-03-17 11:10:08,863.863 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'black', 'white', 'mouth', 'wall', 'eye', 'screen', 'dog', 'animal', 'leg', 'nose', 'ear', 'cat'] 2022-03-17 11:12:32,492.492 2829:trainer.py:487 do_train_dict(): eta: 2:06:52 iter: 62100 speed: 268.2 images/sec total_norm: 148.8828 (150.4552) loss: 136.4514 (138.0297) masked_loss: 1.3424 (1.3971) tag_loss: 135.1617 (136.6326) time: 1.4307 (1.9091) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4255 (1.9039) save_time: 8.8421 (14.2643) lr: 0.000006 max mem: 26307 2022-03-17 11:12:32,852.852 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5151515007019043 2022-03-17 11:12:32,852.852 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 149.12570190429688 2022-03-17 11:12:32,852.852 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
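`Tag mAP` stays around 0.024, three orders of magnitude below the precision number; that magnitude would fit a mean over a very large tag vocabulary in which most classes have few or no positives in the evaluation pool. The averaging convention is not visible here, so the sketch below, which skips classes with no positives (one common choice), is only one possibility:

```python
# Sketch: multi-label mAP via scikit-learn. How the pipeline handles
# classes with no positives is unknown and strongly affects the magnitude.
import numpy as np
from sklearn.metrics import average_precision_score

def tag_map(scores, targets):
    # scores, targets: (num_samples, num_tags); targets in {0, 1}
    aps = [
        average_precision_score(targets[:, c], scores[:, c])
        for c in range(targets.shape[1])
        if targets[:, c].any()            # skip classes with no positives
    ]
    return float(np.mean(aps)) if aps else 0.0
```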
= 71.82594158871764 2022-03-17 11:13:03,785.785 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02396056056022644 2022-03-17 11:13:03,786.786 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 11:13:03,786.786 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'girls', 'legs', 'wearing', 'a', 'pair', 'of', '[MASK]', 'shoes', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 11:13:03,801.801 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shoe', 'leg', 'brick', 'ground', 'person', 'sock', 'bench', '[UNK]', 'short', 'shadow', 'foot', 'arm', 'sidewalk', 'woman', 'design', 'man', 'top', 'white', 'wheel', 'jean', 'wall', 'shirt', 'heel', 'red', 'boy', 'bag', 'trash', 'bolt', 'black', 'logo', 'knee', 'front', 'base', 'skirt', 'next', 'head', 'stripe', 'back', 'street', 'hand', 'pair', 'wooden', 'flower', 'jacket', 'handle', 'can', 'dress', 'chair', 'stone', 'label'] 2022-03-17 11:13:19,773.773 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'little', 'red', 'ground', 'person', 'arm', 'girls', 'jean', 'pair', 'leg', 'bag', 'brick', 'bench', 'shoe', 'windshield', 'sock'] 03-17 11:13:31.474 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-17 11:13:31.475 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-17 11:13:32.453 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-17 11:15:43,632.632 2829:trainer.py:487 do_train_dict(): eta: 2:03:57 iter: 62200 speed: 267.9 images/sec total_norm: 147.6249 (149.1771) loss: 136.6075 (135.7381) masked_loss: 1.3502 (1.3613) tag_loss: 135.5584 (134.3768) time: 1.4313 (1.9115) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4263 (1.9064) save_time: 8.8421 (14.2643) lr: 0.000006 max mem: 26307 2022-03-17 11:15:43,993.993 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5 2022-03-17 11:15:43,993.993 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 129.79275512695312 2022-03-17 11:15:43,993.993 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.8344291760489 2022-03-17 11:16:14,910.910 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023973818868398666 2022-03-17 11:16:14,910.910 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 11:16:14,911.911 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'some', 'horses', 'standing', 'by', 'a', 'trailer', 'in', '[MASK]', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 11:16:14,926.926 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['snow', 'tree', 'tail', 'leg', 'horse', 'bus', 'ground', 'track', 'window', 'sky', 'tire', 'shadow', 'head', 'roof', '[UNK]', 'car', 'hat', 'person', 'door', 'cart', 'building', 'man', 'vehicle', 'blanket', 'harness', 'train', 'saddle', 'face', 'wood', 'snowy', 'paw', 'sign', 'covered', 'shirt', 'wheel', 'next', 'ear', 'trailer', 'truck', 'pole', 'light', 'mane', 'cloud', 'road', 'hair', 'jean', 'stick', 'leash', 'post', 'bench'] 2022-03-17 11:16:30,767.767 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'number', 'door', 'car', 'ground', 'track', 'window', 'tree', 'wood', 'horse', 'sky', 'shirt', 'bus', 'leg', 'roof', 'snow', 'tail', 'hat', 'pole', 'trailer', 'tire', 'harness'] 2022-03-17 11:18:54,664.664 2829:trainer.py:487 do_train_dict(): eta: 2:01:01 iter: 62300 speed: 268.0 images/sec total_norm: 149.0175 (150.6475) loss: 138.3447 (140.0259) masked_loss: 1.3720 (1.3775) tag_loss: 137.0161 (138.6484) time: 1.4313 (1.9103) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4264 (1.9051) save_time: 8.8421 (14.2643) lr: 0.000006 max mem: 26307 2022-03-17 11:18:55,024.024 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6363636255264282 2022-03-17 11:18:55,025.025 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 100.81631469726562 2022-03-17 11:18:55,025.025 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
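Every `Sample Generation` list contains exactly 50 tags, which points to a fixed top-k decode of the tag head rather than a score threshold. A sketch; the sigmoid scoring and the HuggingFace-style convert_ids_to_tokens call are assumptions:

```python
# Sketch: fixed top-k tag decode matching the 50-tag sample lists;
# the tokenizer API assumed here follows the HuggingFace convention.
import torch

def top_tags(tag_logits, tokenizer, k=50):
    scores = torch.sigmoid(tag_logits)    # (batch, num_tags) multi-label scores
    top = torch.topk(scores, k=k, dim=-1)
    return [tokenizer.convert_ids_to_tokens(row.tolist())
            for row in top.indices]
```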
= 71.84540224686647 2022-03-17 11:19:26,342.342 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023985836654901505 2022-03-17 11:19:26,343.343 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 11:19:26,343.343 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'is', '[MASK]', 'a', 'snow', '##board', 'wince', 'a', 'hill', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 11:19:26,359.359 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['snow', 'sky', '[UNK]', 'person', 'man', 'jacket', 'mountain', 'ground', 'ski', 'leg', 'hill', 'pole', 'hat', 'track', 'backpack', 'arm', 'shadow', 'shirt', 'helmet', 'rock', 'coat', 'head', 'skier', 'group', 'tree', 'cloud', 'board', 'glove', 'mound', 'top', 'snowy', 'ramp', 'pile', 'hand', 'slope', 'foot', 'hair', 'sign', 'boot', 'face', 'logo', 'fence', 'building', 'flag', 'line', 'couple', 'bunch', 'air', 'bag', 'camera'] 2022-03-17 11:19:42,346.346 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'ground', 'rock', 'board', 'person', 'arm', 'hill', 'mountain', 'sky', 'leg', 'clothes', 'snow', 'coat', 'pole', 'jacket', 'ski', 'helmet'] 2022-03-17 11:22:05,883.883 2829:trainer.py:487 do_train_dict(): eta: 1:58:06 iter: 62400 speed: 267.8 images/sec total_norm: 146.9800 (150.2008) loss: 135.2062 (137.2163) masked_loss: 1.4445 (1.4491) tag_loss: 133.9010 (135.7672) time: 1.4319 (1.9122) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4266 (1.9070) save_time: 8.8421 (14.2643) lr: 0.000006 max mem: 26307 2022-03-17 11:22:06,248.248 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6060606241226196 2022-03-17 11:22:06,248.248 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 170.20883178710938 2022-03-17 11:22:06,249.249 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.84179468383789 2022-03-17 11:22:37,294.294 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02401658520102501 2022-03-17 11:22:37,294.294 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 11:22:37,295.295 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'cannons', 'bunch', 'of', 'people', '[MASK]', 'with', 'a', 'ball', 'on', 'a', 'field', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 11:22:37,310.310 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['car', 'grass', 'shirt', 'man', 'tree', 'short', 'hand', '[UNK]', 'boy', 'building', 'stripe', 'hat', 'person', 'hair', 'lot', 'parking', 'van', 'ground', 'field', 'shoe', 'shadow', 'cap', 'suv', 'park', 'arm', 'window', 'roof', 'house', 'watch', 'sky', 'sock', 'group', 'woman', 'head', 'fence', 'truck', 'leg', 'game', 'vehicle', 'cone', 'sunglasses', 'uniform', 'air', 'girl', 'young', 'foot', 'tire', 'line', 'jersey', 'bush'] 2022-03-17 11:22:53,263.263 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'hand', 'building', 'park', 'short', 'car', 'field', 'ground', 'hair', 'girl', 'person', 'lot', 'arm', 'boy', 'foot', 'window', 'tree', 'watch', 'ball', 'jean', 'shirt', 'vehicle', 'shadow', 'grass', 'parking', 'hat', 'cap', 'jacket', 'fence', 'bunch', 'shoe', 'suv', 'stripe'] 2022-03-17 11:25:17,117.117 2829:trainer.py:487 do_train_dict(): eta: 1:55:10 iter: 62500 speed: 267.7 images/sec total_norm: 149.1545 (153.3212) loss: 138.8012 (139.8543) masked_loss: 1.4600 (1.4645) tag_loss: 137.1537 (138.3898) time: 1.4312 (1.9124) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4263 (1.9073) save_time: 8.8421 (14.2643) lr: 0.000006 max mem: 26307 2022-03-17 11:25:17,478.478 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6176470518112183 2022-03-17 11:25:17,479.479 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 153.95199584960938 2022-03-17 11:25:17,479.479 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.8436902140657 2022-03-17 11:25:48,950.950 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024019574746489525 2022-03-17 11:25:48,950.950 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 11:25:48,950.950 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'people', 'wearing', 'hats', 'that', 'double', 'as', '[MASK]', '##s', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 11:25:48,966.966 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'glasses', 'woman', 'face', 'umbrella', 'hair', 'head', 'wall', '[UNK]', 'table', 'hand', 'design', 'smile', 'person', 'nose', 'eye', 'door', 'man', 'picture', 'chair', 'window', 'cup', 'glass', 'bowl', 'arm', 'button', 'lady', 'boy', 'hat', 'food', 'plate', 'cabinet', 'couple', 'paper', 'bottle', 'girl', 'jacket', 'kitchen', 'box', 'shelf', 'rack', 'light', 'knife', 'handle', 'floor', 'can', 'pot', 'watch', 'book', 'bracelet'] 2022-03-17 11:26:04,919.919 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'door', 'woman', 'cup', 'hair', 'design', 'person', 'table', 'wall', 'smile', 'glass', 'eye', 'shirt', 'nose', 'bowl', 'cabinet', 'hat', 'pan', 'glasses', 'pot', 'shelf', 'container', 'lid', 'umbrella', 'stove', 'knob', 'microwave'] 2022-03-17 11:28:28,433.433 2829:trainer.py:487 do_train_dict(): eta: 1:52:15 iter: 62600 speed: 267.6 images/sec total_norm: 148.2464 (151.1456) loss: 139.1720 (138.5533) masked_loss: 1.3828 (1.4162) tag_loss: 137.7024 (137.1371) time: 1.4314 (1.9131) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4263 (1.9080) save_time: 8.8421 (14.2643) lr: 0.000006 max mem: 26307 2022-03-17 11:28:28,793.793 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6969696879386902 2022-03-17 11:28:28,793.793 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 139.42344665527344 2022-03-17 11:28:28,793.793 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.8429318234871 2022-03-17 11:29:00,245.245 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024010566994547844 2022-03-17 11:29:00,246.246 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 11:29:00,246.246 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '##ammed', 'black', 'bear', 'resting', 'in', 'a', 'large', 'ham', '##mo', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 11:29:00,261.261 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['bear', 'chain', 'nose', 'head', 'snout', 'eye', 'ear', 'paw', 'face', 'tree', 'bag', 'black', 'brown', 'bolt', 'animal', 'muzzle', 'mouth', 'large', 'trunk', 'rope', 'buckle', '[UNK]', 'next', 'bucket', 'building', 'leg', 'ground', 'claw', 'wall', 'foot', 'gear', 'front', 'strap', 'purse', 'pocket', 'wooden', 'horse', 'basket', 'elephant', 'rock', 'leather', 'log', 'grass', 'man', 'teddy', 'dog', 'barrel', 'screw', 'structure', 'bush'] 2022-03-17 11:29:16,231.231 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'face', 'black', 'building', 'large', 'eye', 'baby', 'tree', 'dog', 'nose', 'bag', 'ear', 'bear', 'chain', 'brick', 'gear', 'resting', 'barrel', 'bolt', 'snout', 'paw'] 2022-03-17 11:31:39,446.446 2829:trainer.py:487 do_train_dict(): eta: 1:49:19 iter: 62700 speed: 268.0 images/sec total_norm: 147.6171 (149.7025) loss: 137.2416 (139.2092) masked_loss: 1.4862 (1.4644) tag_loss: 136.0555 (137.7448) time: 1.4295 (1.9102) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4243 (1.9050) save_time: 8.8421 (14.2643) lr: 0.000006 max mem: 26307 2022-03-17 11:31:39,807.807 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5428571701049805 2022-03-17 11:31:39,807.807 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 161.46360778808594 2022-03-17 11:31:39,807.807 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.84104612678479 2022-03-17 11:32:11,191.191 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02400583028793335 2022-03-17 11:32:11,191.191 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 11:32:11,191.191 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'bathroom', 'is', 'shown', 'with', 'a', 'stainless', 'steel', 'shelf', ',', '[MASK]', 'and', 'wall', '[MASK]', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 11:32:11,207.207 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'toilet', 'shelf', 'bathroom', 'paper', 'phone', 'bottle', 'lid', 'telephone', 'holder', 'shower', 'floor', 'cord', 'tile', 'door', 'knob', 'head', 'seat', '[UNK]', 'can', 'mirror', 'cabinet', 'button', 'soap', 'handle', 'roll', 'hose', 'control', 'drain', 'room', 'tank', 'light', 'brush', 'reflection', 'cap', 'small', 'dish', 'white', 'tub', 'towel', 'hair', 'bar', 'outlet', 'hand', 'sink', 'box', 'cup', 'bowl', 'rack', 'vent'] 2022-03-17 11:32:27,150.150 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'top', 'light', 'floor', 'wall', 'phone', 'paper', 'steel', 'cabinet', 'bathroom', 'bottle', 'shower', 'telephone', 'brush', 'holder', 'towel', 'basket', 'shelf', 'cord', 'toilet', 'lid', 'knob', 'stainless'] 2022-03-17 11:34:50,775.775 2829:trainer.py:487 do_train_dict(): eta: 1:46:23 iter: 62800 speed: 267.6 images/sec total_norm: 150.2934 (153.1475) loss: 138.7876 (138.4844) masked_loss: 1.3304 (1.3679) tag_loss: 137.0730 (137.1164) time: 1.4316 (1.9133) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4266 (1.9081) save_time: 8.8421 (14.2643) lr: 0.000005 max mem: 26307 2022-03-17 11:34:51,135.135 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6764705777168274 2022-03-17 11:34:51,136.136 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 126.8299560546875 2022-03-17 11:34:51,136.136 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.84651027006261 2022-03-17 11:35:22,638.638 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024001579731702805 2022-03-17 11:35:22,638.638 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 11:35:22,639.639 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'red', 'motorcycle', '[MASK]', '[MASK]', 'a', 'large', 'warehouse', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 11:35:22,654.654 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['building', 'motorcycle', 'tire', 'wheel', 'bike', 'window', '[UNK]', 'light', 'road', 'street', 'sky', 'garage', 'door', 'sign', 'engine', 'sidewalk', 'line', 'pole', 'pipe', 'helmet', 'seat', 'tree', 'fender', 'car', 'wall', 'spoke', 'mirror', 'ground', 'man', 'curb', 'red', 'gas', 'front', 'tank', 'shadow', 'cloud', 'flag', 'black', 'jacket', 'next', 'city', 'exhaust', 'person', 'letter', 'can', 'trash', 'grass', 'rim', 'pillar', 'jean'] 2022-03-17 11:35:38,605.605 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'line', 'building', 'large', 'door', 'road', 'street', 'red', 'light', 'car', 'seat', 'engine', 'window', 'tree', 'sky', 'tank', 'wheel', 'mirror', 'cloud', 'garage', 'bike', 'pipe', 'motorcycle', 'warehouse', 'tire', 'exhaust', 'fender'] 2022-03-17 11:38:02,168.168 2829:trainer.py:487 do_train_dict(): eta: 1:43:28 iter: 62900 speed: 267.5 images/sec total_norm: 146.7174 (149.8026) loss: 142.2374 (142.9025) masked_loss: 1.4138 (1.4346) tag_loss: 140.8938 (141.4679) time: 1.4311 (1.9139) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4259 (1.9088) save_time: 8.8421 (14.2643) lr: 0.000005 max mem: 26307 2022-03-17 11:38:02,529.529 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.47058823704719543 2022-03-17 11:38:02,529.529 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 128.69232177734375 2022-03-17 11:38:02,530.530 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
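The printed learning rate falls from 0.000008 at iter ~61.2k to 0.000005 here and 0.000004 by ~63.5k, i.e. it keeps decaying as the eta approaches zero. That is consistent with, for example, a linear decay to zero near the end of training; the sketch below uses a base_lr and max_iter chosen only to roughly fit the logged values, and both are assumptions:

```python
# Sketch: a linear-decay schedule that roughly reproduces the logged lr;
# base_lr=1e-4 and max_iter=66500 are fitted assumptions, not known values.
def linear_decay_lr(step, base_lr=1e-4, max_iter=66500):
    return base_lr * max(0.0, 1.0 - step / max_iter)

# linear_decay_lr(61200) ~ 8.0e-6 ; linear_decay_lr(63700) ~ 4.2e-6
```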
= 71.85154658120776 2022-03-17 11:38:34,389.389 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02400333806872368 2022-03-17 11:38:34,390.390 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 11:38:34,390.390 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'bunch', 'of', 'elephants', 'are', 'being', '[MASK]', '##ed', 'down', 'a', '[MASK]', 'street', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 11:38:34,405.405 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'person', 'sign', 'sky', 'building', 'sidewalk', 'cloud', 'elephant', 'man', 'jacket', 'street', 'shirt', '[UNK]', 'trunk', 'pole', 'road', 'coat', 'wall', 'bag', 'city', 'fire', 'car', 'jean', 'woman', 'tail', 'hair', 'branch', 'group', 'ear', 'line', 'bridge', 'pig', 'shoe', 'animal', 'child', 'dirt', 'boy', 'window', 'stand', 'sheep', 'brick', 'curb', 'head', 'truck', 'hat', 'sack', 'can', 'block', 'shadow', 'ground'] 2022-03-17 11:38:50,402.402 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'city', 'man', 'line', 'building', 'road', 'street', 'short', 'car', 'fire', 'ground', 'hair', 'person', 'wall', 'tree', 'ball', 'sign', 'sky', 'block', 'shirt', 'truck', 'suit', 'coat', 'cloud', 'pole', 'jacket', 'bunch', 'elephant', 'sidewalk'] 2022-03-17 11:41:13,637.637 2829:trainer.py:487 do_train_dict(): eta: 1:40:32 iter: 63000 speed: 267.4 images/sec total_norm: 148.2000 (151.0592) loss: 137.4753 (140.3135) masked_loss: 1.3761 (1.3950) tag_loss: 135.7175 (138.9185) time: 1.4311 (1.9146) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4261 (1.9095) save_time: 8.8421 (14.2643) lr: 0.000005 max mem: 26307 2022-03-17 11:41:14,000.000 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.8235294222831726 2022-03-17 11:41:14,000.000 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 124.79339599609375 2022-03-17 11:41:14,000.000 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.8502342538486 2022-03-17 11:41:45,453.453 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024008918553590775 2022-03-17 11:41:45,453.453 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 11:41:45,454.454 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'in', 'a', 'blue', 'shirt', 'and', 'apron', 'stands', 'near', 'a', 'counter', 'that', 'has', 'food', 'stacked', '[MASK]', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 11:41:45,469.469 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['man', 'hair', 'oven', 'head', 'wall', 'ear', 'shirt', 'hand', 'food', 'arm', 'grill', 'nose', '[UNK]', 'handle', 'plate', 'table', 'fire', 'pipe', 'pizza', 'face', 'wood', 'kitchen', 'stove', 'container', 'box', 'cord', 'lid', 'jean', 'tool', 'bucket', 'knife', 'pan', 'beard', 'belt', 'light', 'tray', 'stick', 'apron', 'ground', 'hose', 'door', 'dough', 'rack', 'bracelet', 'shelf', 'top', 'floor', 'something', 'bowl', 'fireplace'] 2022-03-17 11:42:01,430.430 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'hand', 'hair', 'blue', 'table', 'wall', 'food', 'chair', 'bar', 'wood', 'shirt', 'nose', 'ear', 'bowl', 'counter', 'handle', 'knife', 'pipe', 'pizza', 'cord', 'rack', 'oven', 'grill'] 03-17 11:43:32.526 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-17 11:43:32.526 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-17 11:43:33.805 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-17 11:44:25,037.037 2829:trainer.py:487 do_train_dict(): eta: 1:37:36 iter: 63100 speed: 267.5 images/sec total_norm: 148.0542 (150.9405) loss: 140.1829 (139.8030) masked_loss: 1.3559 (1.4068) tag_loss: 138.3090 (138.3962) time: 1.4317 (1.9141) data: 0.0001 (0.0005) to_device: 0.0051 (0.0050) time_gpu: 1.4265 (1.9086) save_time: 8.8421 (14.2643) lr: 0.000005 max mem: 26307 2022-03-17 11:44:25,398.398 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091 2022-03-17 11:44:25,398.398 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 130.31765747070312 2022-03-17 11:44:25,399.399 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
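The monitor() snapshots land almost exactly 30 minutes apart (10:43:31, 11:13:32, 11:43:33), so aml_server.py evidently polls the GPUs on a fixed interval. A sketch of such a loop; the interval and the daemon-thread structure are assumptions, and poll_fn could be the gpu_info sketch from earlier:

```python
# Sketch: fixed-interval GPU polling matching the ~30-min spacing of the
# monitor() entries; interval and threading details are assumptions.
import threading
import time

def start_monitor(poll_fn, interval_s=1800):
    def loop():
        while True:
            poll_fn()                  # e.g. log the gpu_info() output
            time.sleep(interval_s)
    t = threading.Thread(target=loop, daemon=True)
    t.start()
    return t
```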
= 71.85510305211514 2022-03-17 11:44:57,218.218 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02400299161672592 2022-03-17 11:44:57,219.219 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 11:44:57,219.219 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'view', 'looking', 'out', '[MASK]', 'two', 'adjacent', '[MASK]', 'windows', 'of', 'two', 'airplanes', '[MASK]', 'pavement', 'with', 'yellow', 'lines', 'and', 'gray', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 11:44:57,235.235 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tail', 'sky', 'line', 'airplane', 'wing', 'window', 'cloud', 'runway', 'ground', 'engine', 'door', 'airport', '[UNK]', 'tree', 'nose', 'wheel', 'mirror', 'building', 'plane', 'road', 'logo', 'mountain', 'windshield', 'cockpit', 'vehicle', 'cone', 'grass', 'jet', 'pole', 'large', 'letter', 'stair', 'white', 'front', 'gate', 'propeller', 'person', 'stripe', 'tire', 'fuselage', 'sign', 'commercial', 'man', 'car', 'way', 'tower', 'name', 'truck', 'blue', 'cart'] 2022-03-17 11:45:13,186.186 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'line', 'door', 'road', 'ground', 'window', 'wing', 'tree', 'sky', 'yellow', 'nose', 'wheel', 'adjacent', 'tail', 'runway', 'airplane', 'pavement'] 2022-03-17 11:47:36,372.372 2829:trainer.py:487 do_train_dict(): eta: 1:34:40 iter: 63200 speed: 267.6 images/sec total_norm: 147.0079 (149.7055) loss: 133.6707 (134.7307) masked_loss: 1.4062 (1.3912) tag_loss: 132.0955 (133.3394) time: 1.4305 (1.9133) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4250 (1.9081) save_time: 8.8421 (14.2643) lr: 0.000005 max mem: 26307 2022-03-17 11:47:36,734.734 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091 2022-03-17 11:47:36,734.734 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 139.88140869140625 2022-03-17 11:47:36,734.734 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.85977142117035 2022-03-17 11:48:08,315.315 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024018915370106697 2022-03-17 11:48:08,316.316 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 11:48:08,316.316 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'tennis', 'player', 'stands', 'awaiting', 'the', '[MASK]', 'expect', '##antly', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 11:48:08,331.331 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shoe', '[UNK]', 'court', 'woman', 'tennis', 'short', 'sock', 'chair', 'shirt', 'wall', 'hair', 'leg', 'hand', 'tank', 'skirt', 'ground', 'top', 'person', 'player', 'man', 'ball', 'head', 'band', 'logo', 'boy', 'letter', 'arm', 'hat', 'watch', 'handle', 'cap', 'ponytail', 'wrist', 'ear', 'girl', 'line', 'fence', 'bracelet', 'dirt', 'shadow', 'sign', 'dress', 'flower', 'sunglasses', 'glasses', 'banner', 'tree', 'uniform', 'jacket', 'outfit'] 2022-03-17 11:48:24,231.231 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'top', 'player', 'woman', 'court', 'short', 'ground', 'hair', 'person', 'wall', 'chair', 'tree', 'ball', 'letter', 'shirt', 'dog', 'leg', 'crown', 'tank', 'tennis', 'hat', 'wrist', 'banner', 'skirt', 'shoe', 'ponytail', 'sock'] 2022-03-17 11:50:48,035.035 2829:trainer.py:487 do_train_dict(): eta: 1:31:44 iter: 63300 speed: 267.1 images/sec total_norm: 148.8337 (151.8663) loss: 137.5023 (139.1452) masked_loss: 1.3838 (1.3903) tag_loss: 136.3300 (137.7550) time: 1.4323 (1.9166) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4271 (1.9114) save_time: 8.8421 (14.2643) lr: 0.000005 max mem: 26307 2022-03-17 11:50:48,396.396 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361 2022-03-17 11:50:48,396.396 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 129.29562377929688 2022-03-17 11:50:48,397.397 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.86152212702514 2022-03-17 11:51:20,261.261 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0240024384111166 2022-03-17 11:51:20,261.261 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 11:51:20,262.262 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', 'big', 'boat', 'is', 'doing', 'down', 'the', '[MASK]', 'carrying', 'passengers', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 11:51:20,277.277 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['water', 'building', 'window', 'tree', 'boat', 'sky', 'fence', 'bridge', 'reflection', 'door', 'person', 'wall', 'light', 'river', 'stair', 'sign', 'roof', 'post', 'dock', '[UNK]', 'red', 'man', 'canal', 'bottom', 'railing', 'lamp', 'front', 'step', 'sidewalk', 'wake', 'ripple', 'city', 'grass', 'bush', 'engine', 'flag', 'cover', 'car', 'large', 'canopy', 'shirt', 'flower', 'writing', 'clock', 'tower', 'umbrella', 'arch', 'white', 'balcony', 'entrance'] 2022-03-17 11:51:36,256.256 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['man', 'water', 'building', 'river', 'door', 'light', 'big', 'hair', 'post', 'person', 'wall', 'bridge', 'cover', 'window', 'tree', 'sign', 'sky', 'shirt', 'dog', 'boat', 'wake', 'fence', 'reflection', 'lamp'] 2022-03-17 11:53:59,851.851 2829:trainer.py:487 do_train_dict(): eta: 1:28:48 iter: 63400 speed: 266.9 images/sec total_norm: 147.7166 (152.0751) loss: 139.4232 (139.4124) masked_loss: 1.4781 (1.4925) tag_loss: 137.9693 (137.9198) time: 1.4330 (1.9181) data: 0.0001 (0.0002) to_device: 0.0049 (0.0048) time_gpu: 1.4281 (1.9131) save_time: 8.8421 (14.2643) lr: 0.000005 max mem: 26307 2022-03-17 11:54:00,211.211 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.800000011920929 2022-03-17 11:54:00,212.212 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 138.64056396484375 2022-03-17 11:54:00,212.212 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.86758499145508 2022-03-17 11:54:32,134.134 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024002449586987495 2022-03-17 11:54:32,134.134 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 11:54:32,135.135 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'staring', 'at', 'a', 'television', 'screen', 'with', 'geese', 'on', 'it', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 11:54:32,150.150 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['cat', 'television', 'bird', 'logo', 'shelf', 'wall', 'picture', 'mountain', 'screen', 'wing', 'box', 'sky', 'ear', 'grass', 'field', 'head', 'beak', 'table', 'tail', 'stand', '[UNK]', 'frame', 'book', 'duck', 'speaker', 'tv', 'bottle', 'water', 'hill', 'cloud', 'cord', 'room', 'animal', 'flat', 'wire', 'vase', 'bat', 'baseball', 'ground', 'dog', 'penguin', 'candle', 'base', 'floor', 'fireplace', 'man', 'player', 'paper', 'painting', 'airplane'] 2022-03-17 11:54:48,146.146 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'field', 'television', 'wall', 'hill', 'mountain', 'wing', 'box', 'sky', 'picture', 'screen', 'dog', 'ear', 'staring', 'bird', 'frame', 'cat', 'grass', 'tail', 'bottle', 'speaker', 'wire', 'logo', 'duck', 'shelf', 'beak', 'geese'] 2022-03-17 11:57:11,572.572 2829:trainer.py:487 do_train_dict(): eta: 1:25:52 iter: 63500 speed: 267.1 images/sec total_norm: 147.7207 (149.5208) loss: 137.8922 (138.6856) masked_loss: 1.2975 (1.3355) tag_loss: 136.5096 (137.3501) time: 1.4328 (1.9172) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4277 (1.9121) save_time: 8.8421 (14.2643) lr: 0.000004 max mem: 26307 2022-03-17 11:57:11,932.932 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7142857313156128 2022-03-17 11:57:11,932.932 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 120.93940734863281 2022-03-17 11:57:11,932.932 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.87128718993948 2022-03-17 11:57:43,686.686 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024005835875868797 2022-03-17 11:57:43,686.686 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 11:57:43,687.687 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'sheep', 'dog', 'rounding', 'up', 'sheep', 'as', 'on', '[MASK]', '##oke', '##rs', 'watch', 'jul', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 11:57:43,702.702 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['person', 'tree', 'sheep', 'shirt', 'man', 'hat', 'ground', 'woman', 'road', 'fence', 'building', 'head', 'gravel', '[UNK]', 'leg', 'sunglasses', 'boy', 'group', 'grass', 'bench', 'trunk', 'girl', 'dog', 'hand', 'roof', 'house', 'shoe', 'jacket', 'wall', 'jean', 'bush', 'hair', 'cap', 'child', 'wool', 'skirt', 'window', 'glasses', 'leaf', 'flower', 'shadow', 'table', 'animal', 'front', 'pen', 'car', 'vest', 'coat', 'sign', 'herd'] 2022-03-17 11:57:59,678.678 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'building', 'road', 'woman', 'ground', 'person', 'child', 'boy', 'tree', 'jean', 'shirt', 'dog', 'roof', 'tail', 'hat', 'cap', 'sheep', 'fence', 'sunglasses'] 2022-03-17 12:00:23,254.254 2829:trainer.py:487 do_train_dict(): eta: 1:22:56 iter: 63600 speed: 267.1 images/sec total_norm: 149.1510 (153.5733) loss: 138.9556 (140.8472) masked_loss: 1.3446 (1.4042) tag_loss: 137.6695 (139.4431) time: 1.4320 (1.9168) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4271 (1.9117) save_time: 8.8421 (14.2643) lr: 0.000004 max mem: 26307 2022-03-17 12:00:23,616.616 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6388888955116272 2022-03-17 12:00:23,616.616 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 139.521728515625 2022-03-17 12:00:23,616.616 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.8750135879876 2022-03-17 12:00:55,399.399 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02401728555560112 2022-03-17 12:00:55,399.399 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 12:00:55,400.400 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'group', 'of', 'vehicles', 'that', '[MASK]', 'sitting', 'in', '[MASK]', 'street', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 12:00:55,415.415 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'car', 'cloud', 'light', 'pole', 'road', 'sign', 'street', 'bridge', 'line', 'traffic', 'tire', '[UNK]', 'wall', 'building', 'highway', 'grass', 'truck', 'fence', 'sidewalk', 'tree', 'window', 'parking', 'license', 'bus', 'van', 'person', 'arrow', 'curb', 'plate', 'windshield', 'lot', 'suv', 'wheel', 'intersection', 'barrier', 'tower', 'tail', 'vehicle', 'man', 'city', 'cone', 'bush', 'mirror', 'flag', 'freeway', 'busy', 'ground', 'large', 'water'] 2022-03-17 12:01:11,327.327 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'group', 'line', 'building', 'road', 'street', 'light', 'car', 'design', 'bridge', 'window', 'sign', 'sky', 'clock', 'mirror', 'cloud', 'pole', 'barrel', 'fence', 'barrier', 'sidewalk', 'tire'] 2022-03-17 12:03:35,077.077 2829:trainer.py:487 do_train_dict(): eta: 1:20:00 iter: 63700 speed: 266.9 images/sec total_norm: 148.3544 (151.4542) loss: 138.0045 (139.8757) masked_loss: 1.3450 (1.3894) tag_loss: 136.7826 (138.4863) time: 1.4313 (1.9183) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4262 (1.9132) save_time: 8.8421 (14.2643) lr: 0.000004 max mem: 26307 2022-03-17 12:03:35,438.438 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5 2022-03-17 12:03:35,438.438 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 144.75341796875 2022-03-17 12:03:35,439.439 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.87280789438086 2022-03-17 12:04:07,445.445 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024043269455432892 2022-03-17 12:04:07,445.445 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 12:04:07,446.446 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'dog', 'is', '[MASK]', 'to', 'get', 'a', '[MASK]', '##is', '##bee', 'out', 'of', 'someone', "'", 's', 'hand', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 12:04:07,461.461 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['head', 'dog', 'floor', 'nose', '[UNK]', 'eye', 'ear', 'cat', 'plate', 'carpet', 'collar', 'shoe', 'face', 'black', 'snout', 'paw', 'leg', 'person', 'design', 'wire', 'cd', 'disc', 'cord', 'front', 'neck', 'next', 'body', 'shadow', 'wall', 'couch', 'pillow', 'table', 'foot', 'wheel', 'white', 'rug', 'button', 'top', 'back', 'light', 'ground', 'spot', 'brown', 'mouse', 'mouth', 'hand', 'chair', 'circle', 'computer', 'remote'] 2022-03-17 12:04:23,435.435 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'hand', 'someone', 'person', 'floor', 'eye', 'paper', 'foot', 'dog', 'nose', 'ear', 'cat', 'plate', 'carpet', 'shoe', 'paw'] 2022-03-17 12:06:47,128.128 2829:trainer.py:487 do_train_dict(): eta: 1:17:04 iter: 63800 speed: 266.6 images/sec total_norm: 150.0342 (152.7523) loss: 137.9230 (138.1504) masked_loss: 1.3779 (1.3900) tag_loss: 136.0630 (136.7605) time: 1.4327 (1.9205) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4274 (1.9153) save_time: 8.8421 (14.2643) lr: 0.000004 max mem: 26307 2022-03-17 12:06:47,489.489 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6285714507102966 2022-03-17 12:06:47,490.490 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 108.09188842773438 2022-03-17 12:06:47,490.490 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.88343372255424 2022-03-17 12:07:19,653.653 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024055734276771545 2022-03-17 12:07:19,654.654 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 12:07:19,654.654 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'white', 'paint', 'holding', 'food', 'eclectic', 'includes', '[MASK]', '##cco', '##li', 'and', 'meat', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 12:07:19,670.670 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['table', 'food', 'plate', '[UNK]', 'meat', 'glass', 'potato', 'knife', 'handle', 'liquid', 'beer', 'steak', 'cup', 'vegetable', 'chicken', 'blade', 'reflection', 'design', 'carrot', 'sauce', 'fork', 'drink', 'shadow', 'fish', 'beverage', 'meal', 'cheese', 'white', 'screw', 'mushroom', 'pepper', 'onion', 'bread', 'bowl', 'next', 'bottle', 'dinner', 'green', 'piece', 'tea', 'light', 'salad', 'leaf', 'corn', 'juice', 'paper', 'breast', 'pizza', 'different', 'blue'] 2022-03-17 12:07:35,537.537 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'white', 'table', 'food', 'glass', 'handle', 'plate', 'shadow', 'beer', 'knife', 'meat', 'liquid', 'paint', 'cake', 'potato', 'steak'] 2022-03-17 12:09:59,004.004 2829:trainer.py:487 do_train_dict(): eta: 1:14:07 iter: 63900 speed: 266.8 images/sec total_norm: 147.0879 (149.2175) loss: 137.6447 (138.8817) masked_loss: 1.4027 (1.4419) tag_loss: 136.2104 (137.4398) time: 1.4327 (1.9188) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4275 (1.9136) save_time: 8.8421 (14.2643) lr: 0.000004 max mem: 26307 2022-03-17 12:09:59,366.366 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6944444179534912 2022-03-17 12:09:59,366.366 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 152.9551544189453 2022-03-17 12:09:59,366.366 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.88531295657158 2022-03-17 12:10:31,182.182 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024061333388090134 2022-03-17 12:10:31,183.183 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 12:10:31,184.184 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'red', 'bus', 'driving', 'down', 'an', 'english', 'street', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 12:10:31,199.199 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'clock', 'tower', 'bus', 'road', 'building', 'line', 'window', 'car', 'street', 'tree', '[UNK]', 'cloud', 'wheel', 'tire', 'person', 'plate', 'license', 'castle', 'arrow', 'spire', 'sidewalk', 'sign', 'back', 'light', 'roof', 'top', 'decker', 'windshield', 'door', 'front', 'shadow', 'city', 'pole', 'fence', 'flag', 'cone', 'red', 'double', 'cross', 'man', 'truck', 'tall', 'curb', 'large', 'background', 'passenger', 'traffic', 'side', 'busy'] 2022-03-17 12:10:47,200.200 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'line', 'building', 'large', 'top', 'door', 'road', 'front', 'street', 'red', 'car', 'castle', 'window', 'tree', 'tower', 'sign', 'sky', 'bus', 'clock', 'plate', 'wheel', 'license', 'cloud', 'arrow', 'tire', 'cone', 'windshield', 'spire'] 2022-03-17 12:13:11,217.217 2829:trainer.py:487 do_train_dict(): eta: 1:11:11 iter: 64000 speed: 266.4 images/sec total_norm: 150.0655 (153.2719) loss: 137.4899 (138.0345) masked_loss: 1.4258 (1.4411) tag_loss: 135.7237 (136.5934) time: 1.4342 (1.9221) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4289 (1.9169) save_time: 8.8421 (14.2643) lr: 0.000004 max mem: 26307 2022-03-17 12:13:11,578.578 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.65625 2022-03-17 12:13:11,578.578 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 165.48831176757812 2022-03-17 12:13:11,579.579 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.88359491761872 03-17 12:13:33.905 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-17 12:13:33.905 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-17 12:13:34.585 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 0}] 2022-03-17 12:13:43,286.286 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024044038727879524 2022-03-17 12:13:43,286.286 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 12:13:43,287.287 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'sitting', 'at', 'a', '[MASK]', '[MASK]', 'looking', 'at', 'a', 'computer', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 12:13:43,302.302 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['curtain', 'shirt', 'table', 'man', 'laptop', 'floor', 'leg', 'wall', 'hair', 'keyboard', 'screen', 'short', 'computer', '[UNK]', 'room', 'window', 'foot', 'head', 'face', 'hand', 'chair', 'beard', 'coffee', 'ear', 'door', 'television', 'glasses', 'nose', 'cup', 'cord', 'desk', 'shoe', 'picture', 'camera', 'handle', 'mouse', 'monitor', 'sock', 'phone', 'arm', 'mug', 'bed', 'rug', 'mouth', 'top', 'pillow', 'book', 'stand', 'lamp', 'front'] 2022-03-17 12:13:59,275.275 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'room', 'door', 'cup', 'short', 'television', 'hair', 'floor', 'table', 'wall', 'seat', 'chair', 'computer', 'window', 'shirt', 'kitchen', 'leg', 'roof', 'ear', 'camera', 'coat', 'deck', 'jacket', 'sink', 'monitor', 'keyboard', 'curtain', 'cord', 'laptop'] 2022-03-17 12:16:23,229.229 2829:trainer.py:487 do_train_dict(): eta: 1:08:15 iter: 64100 speed: 266.7 images/sec total_norm: 149.5569 (151.4975) loss: 139.1717 (141.6541) masked_loss: 1.3183 (1.3352) tag_loss: 137.5977 (140.3190) time: 1.4330 (1.9202) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4279 (1.9150) save_time: 8.8421 (14.2643) lr: 0.000003 max mem: 26307 2022-03-17 12:16:23,590.590 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7058823704719543 2022-03-17 12:16:23,590.590 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 141.14840698242188 2022-03-17 12:16:23,591.591 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.8877135362952 2022-03-17 12:16:55,478.478 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024037275463342667 2022-03-17 12:16:55,479.479 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 12:16:55,479.479 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', 'birds', '[MASK]', 'feeding', '[MASK]', 'the', 'bird', 'feeder', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 12:16:55,494.494 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['bird', 'feeder', 'branch', 'tail', 'tree', 'seed', 'head', 'cage', 'wire', 'leaf', 'hole', 'chain', 'food', 'hook', 'wing', '[UNK]', 'feather', 'beak', 'pole', 'eye', 'basket', 'metal', 'window', 'water', 'container', 'wall', 'foot', 'dish', 'leg', 'handle', 'trunk', 'small', 'top', 'tray', 'object', 'hanging', 'mesh', 'ground', 'vine', 'light', 'plant', 'next', 'string', 'base', 'building', 'cord', 'box', 'spot', 'bean', 'holder'] 2022-03-17 12:17:11,441.441 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'food', 'tree', 'branch', 'sky', 'chain', 'bird', 'hole', 'tail', 'seed', 'pole', 'leaf', 'wire', 'cage', 'hook', 'feather', 'feeder'] 2022-03-17 12:19:35,440.440 2829:trainer.py:487 do_train_dict(): eta: 1:05:18 iter: 64200 speed: 266.4 images/sec total_norm: 147.1614 (150.7025) loss: 139.2600 (139.8896) masked_loss: 1.3733 (1.4037) tag_loss: 138.0028 (138.4859) time: 1.4330 (1.9221) data: 0.0001 (0.0005) to_device: 0.0050 (0.0049) time_gpu: 1.4278 (1.9166) save_time: 8.8421 (14.2643) lr: 0.000003 max mem: 26307 2022-03-17 12:19:35,801.801 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.59375 2022-03-17 12:19:35,802.802 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 131.42889404296875 2022-03-17 12:19:35,802.802 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.88970443583202 2022-03-17 12:20:08,078.078 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02407951094210148 2022-03-17 12:20:08,079.079 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 12:20:08,079.079 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'bunch', 'of', 'cows', 'that', 'are', '[MASK]', 'the', 'grass', '.', '1979', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 12:20:08,094.094 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['cow', 'building', 'tree', 'grass', 'window', 'roof', 'head', 'sky', 'ear', 'chimney', 'pole', 'road', 'fence', 'post', 'house', 'field', 'nose', 'face', 'cloud', 'wall', 'barn', '[UNK]', 'sign', 'herd', 'sheep', 'line', 'cattle', 'pasture', 'green', 'animal', 'calf', 'leg', 'door', 'truck', 'wire', 'wheel', 'ground', 'car', 'hill', 'tag', 'rope', 'large', 'eye', 'wood', 'collar', 'background', 'white', 'farm', 'forest', 'spot'] 2022-03-17 12:20:24,001.001 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'face', 'building', 'door', 'road', 'field', 'post', 'wall', 'window', 'tree', 'sky', 'roof', 'nose', 'ear', 'grass', 'cloud', 'pole', 'barn', 'bunch', 'cow', 'chimney'] 2022-03-17 12:22:47,563.563 2829:trainer.py:487 do_train_dict(): eta: 1:02:22 iter: 64300 speed: 266.5 images/sec total_norm: 149.3504 (153.0083) loss: 139.2202 (137.7410) masked_loss: 1.3827 (1.4195) tag_loss: 137.9611 (136.3215) time: 1.4334 (1.9212) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4282 (1.9161) save_time: 8.8421 (14.2643) lr: 0.000003 max mem: 26307 2022-03-17 12:22:47,924.924 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361 2022-03-17 12:22:47,924.924 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 140.81643676757812 2022-03-17 12:22:47,924.924 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.89210189795642 2022-03-17 12:23:20,020.020 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02411685697734356 2022-03-17 12:23:20,021.021 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 12:23:20,021.021 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'woman', '[MASK]', 'down', 'while', 'holding', 'a', 'black', 'umbrella', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 12:23:20,037.037 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'umbrella', 'hand', 'head', '[UNK]', 'shoe', 'woman', 'ground', 'watch', 'foot', 'bag', 'face', 'handle', 'ear', 'person', 'building', 'floor', 'scarf', 'nose', 'leg', 'jacket', 'shirt', 'arm', 'door', 'girl', 'backpack', 'pole', 'sidewalk', 'mouth', 'flop', 'glasses', 'hair', 'eye', 'flip', 'man', 'lady', 'cloth', 'rock', 'clothing', 'child', 'hood', 'strap', 'pipe', 'ledge', 'dress', 'pink', 'graffiti', 'stripe', 'brick', 'hole'] 2022-03-17 12:23:35,964.964 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'hand', 'face', 'black', 'building', 'door', 'woman', 'ground', 'floor', 'wall', 'lady', 'eye', 'foot', 'watch', 'bag', 'ear', 'chain', 'handle', 'glasses', 'towel', 'shoe', 'umbrella', 'graffiti', 'scarf'] 2022-03-17 12:25:59,877.877 2829:trainer.py:487 do_train_dict(): eta: 0:59:25 iter: 64400 speed: 266.2 images/sec total_norm: 148.6476 (150.0444) loss: 139.4954 (138.3436) masked_loss: 1.4479 (1.4636) tag_loss: 138.1433 (136.8799) time: 1.4333 (1.9232) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4281 (1.9181) save_time: 8.8421 (14.2643) lr: 0.000003 max mem: 26307 2022-03-17 12:26:00,237.237 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.65625 2022-03-17 12:26:00,237.237 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 111.07911682128906 2022-03-17 12:26:00,237.237 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.90109275995299 2022-03-17 12:26:32,329.329 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024121245369315147 2022-03-17 12:26:32,329.329 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 12:26:32,329.329 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '##typic', 'of', 'a', 'pub', '[MASK]', 'named', 'the', 'lion', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 12:26:32,345.345 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sign', 'building', 'sky', 'light', 'pole', 'street', 'window', 'tree', 'store', 'roof', '[UNK]', 'car', 'sidewalk', 'wall', 'city', 'person', 'road', 'stop', 'traffic', 'door', 'letter', 'man', 'line', 'trash', 'can', 'fire', 'lamp', 'jacket', 'restaurant', 'arrow', 'curb', 'post', 'cloud', 'mirror', 'flag', 'corner', 'shirt', 'chimney', 'banner', 'bag', 'pipe', 'plant', 'coat', 'reflection', 'shop', 'fence', 'intersection', 'woman', 'night', 'balcony'] 2022-03-17 12:26:48,255.255 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'city', 'building', 'road', 'street', 'light', 'car', 'person', 'wall', 'view', 'paper', 'window', 'store', 'letter', 'sign', 'sky', 'coat', 'pole', 'pub', 'lamp', 'sidewalk'] 2022-03-17 12:29:12,080.080 2829:trainer.py:487 do_train_dict(): eta: 0:56:29 iter: 64500 speed: 266.4 images/sec total_norm: 147.1693 (154.0176) loss: 138.6938 (139.7664) masked_loss: 1.4062 (1.4373) tag_loss: 136.6238 (138.3291) time: 1.4319 (1.9220) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4267 (1.9168) save_time: 8.8421 (14.2643) lr: 0.000003 max mem: 26307 2022-03-17 12:29:12,440.440 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7777777910232544 2022-03-17 12:29:12,441.441 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 151.95896911621094 2022-03-17 12:29:12,441.441 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.89935798172611 2022-03-17 12:29:44,655.655 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02411646395921707 2022-03-17 12:29:44,655.655 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 12:29:44,656.656 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[unused516]', 'red', 'and', 'white', 'boat', 'parked', 'next', '[MASK]', 'a', 'house', 'with', 'a', 'woman', 'standing', 'next', 'to', 'a', 'dog', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 12:29:44,671.671 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['boat', 'bush', 'water', 'tree', 'reflection', 'man', 'wall', 'roof', 'person', 'shirt', 'short', 'building', 'dog', 'flag', 'woman', 'hedge', 'sky', 'grass', '[UNK]', 'hair', 'writing', 'house', 'motor', 'jacket', 'hat', 'head', 'small', 'plant', 'engine', 'bottom', 'red', 'pole', 'dock', 'post', 'rope', 'flower', 'chair', 'hand', 'window', 'door', 'lamp', 'front', 'name', 'river', 'wheel', 'boy', 'skirt', 'leg', 'car', 'white'] 2022-03-17 12:30:00,631.631 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'house', 'name', 'next', 'water', 'building', 'white', 'red', 'woman', 'short', 'hair', 'person', 'wall', 'engine', 'tree', 'sky', 'shirt', 'dog', 'boat', 'roof', 'flag', 'grass', 'bush', 'hat', 'ski', 'reflection', 'hedge'] 2022-03-17 12:32:24,435.435 2829:trainer.py:487 do_train_dict(): eta: 0:53:32 iter: 64600 speed: 266.2 images/sec total_norm: 148.8909 (152.3475) loss: 140.1122 (139.7863) masked_loss: 1.3984 (1.3970) tag_loss: 138.6306 (138.3893) time: 1.4319 (1.9236) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4267 (1.9184) save_time: 8.8421 (14.2643) lr: 0.000003 max mem: 26307 2022-03-17 12:32:24,796.796 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6666666865348816 2022-03-17 12:32:24,797.797 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 157.34005737304688 2022-03-17 12:32:24,797.797 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.89721969179982 2022-03-17 12:32:57,019.019 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024100879207253456 2022-03-17 12:32:57,020.020 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 12:32:57,020.020 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'some', 'cartoon', 'character', '[MASK]', 'are', '[MASK]', 'in', 'this', 'photo', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 12:32:57,035.035 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'table', 'brick', 'mouth', 'pen', 'box', 'face', 'eye', 'toy', 'button', '[UNK]', 'book', 'handle', 'carrot', 'nose', 'phone', 'pencil', 'stand', 'label', 'marker', 'bottle', 'base', 'pumpkin', 'shadow', 'display', 'screen', 'block', 'holder', 'orange', 'drawer', 'top', 'cord', 'cap', 'next', 'wooden', 'bag', 'container', 'teeth', 'floor', 'antenna', 'tag', 'cloth', 'strap', 'cell', 'other', 'room', 'item', 'stack', 'cover', 'key'] 2022-03-17 12:33:12,874.874 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'mouth', 'table', 'wall', 'character', 'cover', 'stand', 'eye', 'box', 'label', 'nose', 'display', 'handle', 'brick', 'apple', 'photo', 'button', 'pen', 'item', 'toy', 'cartoon', 'marker', 'strap'] 2022-03-17 12:35:36,940.940 2829:trainer.py:487 do_train_dict(): eta: 0:50:36 iter: 64700 speed: 266.0 images/sec total_norm: 147.8625 (151.1586) loss: 137.2090 (139.5478) masked_loss: 1.3045 (1.3882) tag_loss: 135.9091 (138.1596) time: 1.4320 (1.9250) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4269 (1.9198) save_time: 8.8421 (14.2643) lr: 0.000003 max mem: 26307 2022-03-17 12:35:37,300.300 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091 2022-03-17 12:35:37,300.300 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 124.28950500488281 2022-03-17 12:35:37,301.301 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.90295072838113 2022-03-17 12:36:09,741.741 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024104544892907143 2022-03-17 12:36:09,742.742 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 12:36:09,742.742 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'woman', 'holding', 'up', 'a', 'cell', 'phone', 'in', 'front', '[MASK]', 'her', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 12:36:09,757.757 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hair', 'head', 'shirt', 'bang', 'phone', 'hand', 'jacket', 'girl', 'man', 'woman', 'person', 'sleeve', 'eye', 'cell', 'cuff', 'wall', 'picture', '[UNK]', 'button', 'ear', 'face', 'nose', 'door', 'coat', 'camera', 'finger', 'screen', 'arm', 'green', 'young', 'light', 'blonde', 'sign', 'photo', 'pole', 'ceiling', 'chair', 'glasses', 'cup', 'ponytail', 'jean', 'cabinet', 'front', 'background', 'sweater', 'window', 'lady', 'blond', 'glass', 'building'] 2022-03-17 12:36:25,642.642 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'hand', 'face', 'front', 'woman', 'hair', 'girl', 'person', 'wall', 'phone', 'eye', 'cell', 'shirt', 'screen', 'finger', 'ear', 'coat', 'jacket', 'bang', 'sleeve', 'cuff'] 2022-03-17 12:38:49,177.177 2829:trainer.py:487 do_train_dict(): eta: 0:47:39 iter: 64800 speed: 266.3 images/sec total_norm: 149.3931 (151.4496) loss: 134.2734 (136.5747) masked_loss: 1.3836 (1.4233) tag_loss: 132.1940 (135.1515) time: 1.4319 (1.9224) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4266 (1.9173) save_time: 8.8421 (14.2643) lr: 0.000002 max mem: 26307 2022-03-17 12:38:49,538.538 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.47058823704719543 2022-03-17 12:38:49,538.538 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 158.97872924804688 2022-03-17 12:38:49,539.539 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.8952772319776 2022-03-17 12:39:22,172.172 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02409748174250126 2022-03-17 12:39:22,173.173 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 12:39:22,173.173 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'cats', 'sitting', 'on', 'a', 'lounge', 'chair', '[MASK]', 'looking', 'out', 'a', 'window', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 12:39:22,189.189 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['window', 'cat', 'ear', 'floor', 'car', 'head', 'leg', 'wall', 'paw', 'bench', 'couch', 'cushion', 'building', 'tail', '[UNK]', 'light', 'reflection', 'chair', 'shadow', 'black', 'arm', 'seat', 'sofa', 'room', 'pillow', 'frame', 'towel', 'bolt', 'small', 'nose', 'yellow', 'next', 'lamp', 'white', 'tire', 'carpet', 'table', 'blanket', 'wheel', 'dog', 'book', 'foot', 'snow', 'suitcase', 'bed', 'large', 'road', 'front', 'sun', 'top'] 2022-03-17 12:39:38,131.131 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'building', 'book', 'car', 'floor', 'wall', 'seat', 'arm', 'chair', 'window', 'ear', 'cat', 'tail', 'couch', 'bench', 'shelf', 'lounge', 'paw', 'cushion'] 2022-03-17 12:42:01,805.805 2829:trainer.py:487 do_train_dict(): eta: 0:44:42 iter: 64900 speed: 265.8 images/sec total_norm: 147.9228 (150.3981) loss: 137.4824 (137.6302) masked_loss: 1.3522 (1.3890) tag_loss: 136.4287 (136.2412) time: 1.4310 (1.9263) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4259 (1.9213) save_time: 8.8421 (14.2643) lr: 0.000002 max mem: 26307 2022-03-17 12:42:02,165.165 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5 2022-03-17 12:42:02,165.165 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 154.29269409179688 2022-03-17 12:42:02,165.165 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.89981867276705 2022-03-17 12:42:34,790.790 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024111177772283554 2022-03-17 12:42:34,790.790 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 12:42:34,791.791 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'large', '[MASK]', 'in', 'short', 'brown', 'hair', 'don', '##s', 'a', '[MASK]', '[MASK]', 'girl', 'outfit', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 12:42:34,806.806 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hand', 'skirt', 'shirt', 'wall', 'leg', 'floor', 'box', 'woman', 'belt', 'television', 'hair', 'face', 'bag', '[UNK]', 'mouth', 'man', 'arm', 'monitor', 'head', 'nose', 'ground', 'computer', 'eye', 'outlet', 'tile', 'desk', 'handle', 'picture', 'person', 'finger', 'shoe', 'dress', 'shelf', 'ceiling', 'cord', 'girl', 'book', 'room', 'frame', 'screen', 'drawer', 'cabinet', 'paper', 'table', 'stripe', 'sock', 'glasses', 'cardboard', 'light', 'stand'] 2022-03-17 12:42:50,813.813 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'school', 'head', 'man', 'hand', 'large', 'book', 'woman', 'short', 'television', 'ground', 'hair', 'girl', 'person', 'floor', 'wall', 'brown', 'smile', 'computer', 'box', 'border', 'shirt', 'picture', 'screen', 'leg', 'bag', 'desk', 'frame', 'tie', 'belt', 'stick', 'monitor', 'collar', 'skirt', 'pillow', 'outfit', 'shelf', 'drawer', 'tile'] 03-17 12:43:34.669 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-17 12:43:34.669 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-17 12:43:35.884 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-17 12:45:14,199.199 2829:trainer.py:487 do_train_dict(): eta: 0:41:46 iter: 65000 speed: 266.1 images/sec total_norm: 148.2992 (151.1839) loss: 138.9783 (139.3487) masked_loss: 1.4523 (1.4882) tag_loss: 137.3715 (137.8605) time: 1.4306 (1.9238) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4255 (1.9186) save_time: 8.8421 (14.2643) lr: 0.000002 max mem: 26307 2022-03-17 12:45:14,201.201 2829:checkpoint.py:222 save(): Saving checkpoint to output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0065000.pt 2022-03-17 12:45:23,646.646 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6764705777168274 
2022-03-17 12:45:23,647.647 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 148.27684020996094 2022-03-17 12:45:23,647.647 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 71.89644677298409 2022-03-17 12:45:56,045.045 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024094535037875175 2022-03-17 12:45:56,045.045 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 12:45:56,046.046 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'television', 'is', 'sitting', '[MASK]', '[MASK]', 'stand', 'with', 'toys', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 12:45:56,061.061 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'handle', 'drawer', 'television', 'toy', 'cabinet', 'curtain', 'dresser', 'screen', 'shirt', 'floor', 'hat', 'blanket', 'bed', 'house', '[UNK]', 'pillow', 'bag', 'table', 'picture', 'box', 'frame', 'window', 'track', 'baby', 'clothes', 'room', 'couch', 'train', 'top', 'doll', 'horse', 'chair', 'desk', 'head', 'small', 'door', 'block', 'sheet', 'shoe', 'leg', 'cap', 'hair', 'cloth', 'stand', 'basket', 'child', 'man', 'person', 'decoration'] 2022-03-17 12:46:11,952.952 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'television', 'bed', 'wall', 'stand', 'block', 'shirt', 'picture', 'screen', 'clothes', 'bag', 'desk', 'frame', 'handle', 'cabinet', 'bow', 'sheet', 'shade', 'blanket', 'toy', 'pillow', 'lamp', 'curtain', 'drawer', 'dresser'] 2022-03-17 12:48:34,626.626 2829:trainer.py:487 do_train_dict(): eta: 0:38:49 iter: 65100 speed: 255.5 images/sec total_norm: 149.4908 (150.6414) loss: 138.9427 (139.9967) masked_loss: 1.4009 (1.4396) tag_loss: 137.5636 (138.5571) time: 1.4302 (2.0044) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4251 (1.9086) save_time: 8.8805 (13.8650) lr: 0.000002 max mem: 26307 2022-03-17 12:48:34,988.988 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7352941036224365 2022-03-17 12:48:34,988.988 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 130.5562744140625 2022-03-17 12:48:34,988.988 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.89891885979775 2022-03-17 12:49:07,678.678 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02409212850034237 2022-03-17 12:49:07,678.678 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 12:49:07,679.679 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'villiers', '[MASK]', 'on', 'a', 'snowy', 'surface', 'wit', 'ha', 'kite', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 12:49:07,694.694 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shadow', 'snow', 'ground', 'kite', '[UNK]', 'hand', 'coat', 'jacket', 'track', 'arm', 'head', 'hair', 'string', 'leg', 'face', 'boot', 'hat', 'girl', 'glove', 'person', 'woman', 'shoe', 'stick', 'shirt', 'ski', 'pole', 'hood', 'scarf', 'rope', 'branch', 'boy', 'tree', 'foot', 'man', 'sky', 'flower', 'backpack', 'young', 'mouth', 'leaf', 'blue', 'eye', 'sunglasses', 'glasses', 'short', 'rock', 'tail', 'wire', 'sun', 'design'] 2022-03-17 12:49:23,588.588 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'hand', 'face', 'ground', 'hair', 'girl', 'person', 'child', 'arm', 'surface', 'leg', 'snow', 'string', 'shadow', 'ha', 'coat', 'bottle', 'hat', 'flower', 'jacket', 'leaf', 'hood', 'rope', 'boot', 'shoe', 'glove', 'wit', 'kite', 'snowy'] 2022-03-17 12:51:47,051.051 2829:trainer.py:487 do_train_dict(): eta: 0:35:52 iter: 65200 speed: 266.1 images/sec total_norm: 148.1385 (152.8203) loss: 137.1125 (138.2405) masked_loss: 1.3537 (1.3640) tag_loss: 135.5274 (136.8765) time: 1.4317 (1.9242) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4264 (1.9190) save_time: 8.8805 (13.8650) lr: 0.000002 max mem: 26307 2022-03-17 12:51:47,411.411 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6000000238418579 2022-03-17 12:51:47,412.412 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 143.08262634277344 2022-03-17 12:51:47,412.412 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.8989073737292 2022-03-17 12:52:20,005.005 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024089697748422623 2022-03-17 12:52:20,005.005 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 12:52:20,006.006 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'usb', 'hub', 'with', 'multiple', 'electronics', 'plug', '[MASK]', 'in', '98', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 12:52:20,021.021 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['table', 'laptop', 'keyboard', 'desk', 'ball', 'computer', 'key', 'phone', 'screen', 'cord', 'wall', 'ipod', '[UNK]', 'mouse', 'logo', 'cell', 'wheel', 'wire', 'floor', 'pen', 'printer', 'pad', 'monitor', 'paper', 'equipment', 'button', 'electronic', 'electronics', 'camera', 'case', 'device', 'knob', 'black', 'speaker', 'base', 'wallet', 'cd', 'object', 'wooden', 'other', 'antenna', 'cable', 'box', 'plug', 'circle', 'next', 'small', 'book', 'cup', 'bottle'] 2022-03-17 12:52:35,963.963 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'case', 'table', 'wall', 'phone', 'key', 'paper', 'ball', 'multiple', 'cd', 'desk', 'wheel', 'mouse', 'logo', 'keyboard', 'hub', 'cord', 'pad', 'laptop', 'printer', 'ipod'] 2022-03-17 12:54:59,700.700 2829:trainer.py:487 do_train_dict(): eta: 0:32:55 iter: 65300 speed: 265.8 images/sec total_norm: 147.7688 (149.2258) loss: 137.2031 (136.0783) masked_loss: 1.3435 (1.3949) tag_loss: 135.5815 (134.6835) time: 1.4319 (1.9265) data: 0.0001 (0.0005) to_device: 0.0049 (0.0048) time_gpu: 1.4270 (1.9212) save_time: 8.8805 (13.8650) lr: 0.000002 max mem: 26307 2022-03-17 12:55:00,060.060 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361 2022-03-17 12:55:00,060.060 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 155.49758911132812 2022-03-17 12:55:00,061.061 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.9048068020322 2022-03-17 12:55:32,540.540 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02407936006784439 2022-03-17 12:55:32,540.540 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 12:55:32,541.541 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', '##raf', '##fe', '[MASK]', 'standing', 'in', 'a', 'grassy', 'field', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 12:55:32,556.556 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['bush', 'leg', 'tree', 'sky', 'neck', 'grass', '[UNK]', 'head', 'field', 'shadow', 'tail', 'ground', 'ear', 'cloud', 'hair', 'background', 'spot', 'horn', 'mane', 'dirt', 'standing', 'grassy', 'body', 'tall', 'open', 'area', 'next', 'plain', 'animal', 'face', 'wild', 'green', 'shrub', 'mouth', 'grazing', 'branch', 'distance', 'stand', 'front', 'dry', 'large', 'lone', 'middle', 'walking', 'day', 'adult', 'small', 'man', 'habitat', 'savannah'] 2022-03-17 12:55:48,465.465 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'field', 'ground', 'hair', 'neck', 'tree', 'sky', 'spot', 'leg', 'ear', 'shadow', 'grass', 'tail', 'bush', 'dirt', 'grassy'] 2022-03-17 12:58:12,289.289 2829:trainer.py:487 do_train_dict(): eta: 0:29:58 iter: 65400 speed: 265.9 images/sec total_norm: 148.6661 (150.4916) loss: 138.8993 (139.9551) masked_loss: 1.3772 (1.4394) tag_loss: 137.7617 (138.5157) time: 1.4331 (1.9259) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4279 (1.9208) save_time: 8.8805 (13.8650) lr: 0.000002 max mem: 26307 2022-03-17 12:58:12,649.649 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7714285850524902 2022-03-17 12:58:12,649.649 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 132.91424560546875 2022-03-17 12:58:12,649.649 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.90525168498964 2022-03-17 12:58:45,560.560 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024068038910627365 2022-03-17 12:58:45,561.561 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 12:58:45,561.561 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'truck', 'with', 'wood', 'side', 'rails', 'in', 'the', 'back', ',', '[MASK]', 'in', 'a', 'parking', 'space', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 12:58:45,577.577 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'truck', 'sky', 'light', 'tire', 'line', 'bumper', 'window', 'road', 'pole', 'building', 'back', 'plate', 'car', 'ground', 'license', 'fence', 'door', 'tail', 'sign', 'wheel', 'bed', 'pickup', 'street', 'handle', 'lot', 'wood', 'wire', 'mirror', 'parking', 'logo', 'wall', 'curb', 'rim', 'cloud', 'grass', 'pick', '[UNK]', 'shadow', 'old', 'power', 'roof', 'post', 'side', 'white', 'house', 'chain', 'next', 'front', 'bush'] 2022-03-17 12:59:01,563.563 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['back', 'side', 'line', 'building', 'road', 'street', 'light', 'car', 'ground', 'space', 'person', 'bed', 'wall', 'window', 'tree', 'wood', 'sky', 'truck', 'plate', 'shadow', 'wheel', 'mirror', 'brick', 'grass', 'parking', 'tail', 'license', 'pole', 'fence', 'rim', 'tire', 'curb', 'railing', 'pedal', 'bumper'] 2022-03-17 13:01:24,945.945 2829:trainer.py:487 do_train_dict(): eta: 0:27:01 iter: 65500 speed: 265.8 images/sec total_norm: 148.7471 (153.3763) loss: 137.3178 (139.4328) masked_loss: 1.4122 (1.3989) tag_loss: 135.7238 (138.0338) time: 1.4319 (1.9266) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4269 (1.9215) save_time: 8.8805 (13.8650) lr: 0.000001 max mem: 26307 2022-03-17 13:01:25,305.305 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6470588445663452 2022-03-17 13:01:25,306.306 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 98.13138580322266 2022-03-17 13:01:25,306.306 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.91387111966203 2022-03-17 13:01:58,029.029 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024125345051288605 2022-03-17 13:01:58,029.029 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 13:01:58,030.030 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'there', 'are', 'airplanes', 'waiting', 'to', '[MASK]', 'off', '[MASK]', 'the', 'runway', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 13:01:58,045.045 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['line', 'airplane', 'tail', 'wing', 'sky', 'window', 'wheel', 'door', 'runway', 'airport', 'ground', '[UNK]', 'cloud', 'engine', 'road', 'plane', 'tree', 'logo', 'nose', 'building', 'mountain', 'large', 'stripe', 'gear', 'fuselage', 'landing', 'pole', 'tire', 'view', 'letter', 'grass', 'commercial', 'front', 'white', 'blue', 'vehicle', 'mirror', 'cone', 'jet', 'red', 'propeller', 'frame', 'way', 'fin', 'stair', 'name', 'windshield', 'small', 'side', 'truck'] 2022-03-17 13:02:13,965.965 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'line', 'door', 'road', 'ground', 'window', 'wing', 'tree', 'sky', 'nose', 'wheel', 'tail', 'runway', 'airplane'] 2022-03-17 13:04:37,724.724 2829:trainer.py:487 do_train_dict(): eta: 0:24:04 iter: 65600 speed: 265.6 images/sec total_norm: 148.8321 (152.0124) loss: 136.5320 (137.9185) masked_loss: 1.3544 (1.3954) tag_loss: 135.2360 (136.5231) time: 1.4316 (1.9277) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4263 (1.9227) save_time: 8.8805 (13.8650) lr: 0.000001 max mem: 26307 2022-03-17 13:04:38,089.089 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6176470518112183 2022-03-17 13:04:38,089.089 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 145.99453735351562 2022-03-17 13:04:38,089.089 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.91470381622082 2022-03-17 13:05:11,211.211 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024118809029459953 2022-03-17 13:05:11,211.211 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 13:05:11,211.211 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'boats', 'in', 'a', 'river', 'with', 'tall', 'buildings', 'on', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 13:05:11,227.227 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['water', 'building', 'sky', 'boat', 'tree', 'beach', 'city', 'shore', 'window', 'bridge', 'cloud', 'mountain', 'background', '[UNK]', 'roof', 'pole', 'structure', 'body', 'sand', 'skyscraper', 'mast', 'ocean', 'large', 'bottom', 'umbrella', 'canopy', 'bush', 'bird', 'rock', 'dock', 'lake', 'shoreline', 'tower', 'antenna', 'flag', 'blue', 'palm', 'metal', 'hill', 'harbor', 'reflection', 'ship', 'front', 'distance', 'top', 'river', 'ball', 'sail', 'sea', 'bay'] 2022-03-17 13:05:27,157.157 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['city', 'water', 'building', 'river', 'land', 'structure', 'window', 'metal', 'tree', 'beach', 'sky', 'bottom', 'boat', 'tall', 'shore', 'dock'] 2022-03-17 13:07:50,632.632 2829:trainer.py:487 do_train_dict(): eta: 0:21:07 iter: 65700 speed: 265.4 images/sec total_norm: 149.1738 (150.6159) loss: 138.4534 (139.1638) masked_loss: 1.4229 (1.4210) tag_loss: 136.9041 (137.7428) time: 1.4317 (1.9291) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4263 (1.9240) save_time: 8.8805 (13.8650) lr: 0.000001 max mem: 26307 2022-03-17 13:07:50,993.993 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663 2022-03-17 13:07:50,993.993 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 136.94309997558594 2022-03-17 13:07:50,993.993 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.92130625646527
2022-03-17 13:08:23,885.885 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024174001067876816
2022-03-17 13:08:23,885.885 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 13:08:23,886.886 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'many', 'people', 'stand', '[MASK]', 'a', 'tennis', '[MASK]', 'with', 'rack', '##ets', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 13:08:23,901.901 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'man', 'short', 'shadow', 'shoe', 'tennis', '[UNK]', 'fence', 'court', 'ball', 'line', 'head', 'hand', 'hat', 'tree', 'boy', 'cap', 'ground', 'bat', 'person', 'hair', 'building', 'pole', 'group', 'sunglasses', 'bush', 'sock', 'roof', 'player', 'glasses', 'grass', 'leg', 'woman', 'arm', 'young', 'baseball', 'net', 'house', 'sign', 'handle', 'sky', 'couple', 'logo', 'light', 'face', 'jersey', 'game', 'uniform', 'watch', 'playing']
2022-03-17 13:08:39,891.891 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'many', 'head', 'man', 'hand', 'line', 'court', 'short', 'hair', 'boy', 'tree', 'ball', 'shirt', 'tennis', 'shadow', 'hat', 'bat', 'glasses', 'fence', 'shoe', 'sunglasses', 'sock']
2022-03-17 13:11:03,795.795 2829:trainer.py:487 do_train_dict(): eta: 0:18:10 iter: 65800 speed: 265.1 images/sec total_norm: 147.9253 (151.0860) loss: 136.0638 (136.9490) masked_loss: 1.4154 (1.4274) tag_loss: 134.6400 (135.5215) time: 1.4325 (1.9316) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4275 (1.9264) save_time: 8.8805 (13.8650) lr: 0.000001 max mem: 26307
2022-03-17 13:11:04,156.156 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6060606241226196
2022-03-17 13:11:04,156.156 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 109.13992309570312
2022-03-17 13:11:04,156.156 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.93311189990123
2022-03-17 13:11:37,237.237 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024181626737117767
2022-03-17 13:11:37,238.238 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 13:11:37,238.238 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'close', 'up', 'of', 'a', '[MASK]', 'on', 'a', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 13:11:37,253.253 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['table', 'bottle', 'label', 'bread', 'sandwich', '[UNK]', 'wine', 'glass', 'jar', 'lid', 'food', 'cup', 'plate', 'container', 'wall', 'top', 'vegetable', 'stem', 'meat', 'book', 'pepper', 'menu', 'cheese', 'flower', 'leaf', 'tomato', 'candy', 'next', 'cherry', 'fork', 'napkin', 'can', 'cloth', 'picture', 'background', 'cap', 'spoon', 'paper', 'cookie', 'fruit', 'onion', 'lamp', 'red', 'basket', 'bowl', 'drink', 'salad', 'pole', 'chicken', 'knife']
2022-03-17 13:11:53,152.152 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'cup', 'close', 'table', 'wall', 'food', 'glass', 'branch', 'label', 'wine', 'plate', 'bottle', 'pole', 'bread', 'fork', 'sandwich', 'container', 'lid', 'menu', 'vegetable', 'jar', 'cookie']
03-17 13:13:35.985 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi
03-17 13:13:35.985 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi
03-17 13:13:37.320 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}]
2022-03-17 13:14:17,015.015 2829:trainer.py:487 do_train_dict(): eta: 0:15:13 iter: 65900 speed: 265.0 images/sec total_norm: 148.1385 (150.4678) loss: 133.8431 (136.3024) masked_loss: 1.2963 (1.3459) tag_loss: 132.1955 (134.9565) time: 1.4322 (1.9323) data: 0.0001 (0.0001) to_device: 0.0052 (0.0051) time_gpu: 1.4269 (1.9270) save_time: 8.8805 (13.8650) lr: 0.000001 max mem: 26307
2022-03-17 13:14:17,376.376 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6060606241226196
2022-03-17 13:14:17,376.376 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 122.76383972167969
2022-03-17 13:14:17,376.376 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.941799065561
2022-03-17 13:14:50,123.123 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024181395769119263
2022-03-17 13:14:50,123.123 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 13:14:50,123.123 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'train', 'stops', 'on', 'the', 'tracks', 'outside', 'of', 'the', 'city', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 13:14:50,139.139 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['track', 'train', 'window', 'building', 'tree', 'ground', 'light', 'pole', 'sky', 'bridge', 'roof', 'car', 'line', '[UNK]', 'front', 'door', 'station', 'platform', 'rail', 'sign', 'fence', 'engine', 'bush', 'wall', 'railing', 'stripe', 'grass', 'person', 'number', 'traffic', 'background', 'windshield', 'top', 'tracks', 'bumper', 'beam', 'passenger', 'sidewalk', 'letter', 'railroad', 'wire', 'street', 'road', 'view', 'signal', 'man', 'wheel', 'black', 'white', 'logo']
2022-03-17 13:15:06,088.088 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'city', 'line', 'station', 'building', 'front', 'light', 'ground', 'track', 'person', 'bridge', 'window', 'train', 'tree', 'sky', 'rail', 'roof', 'bush', 'pole', 'stops', 'fence', 'chimney']
2022-03-17 13:17:30,143.143 2829:trainer.py:487 do_train_dict(): eta: 0:12:16 iter: 66000 speed: 265.1 images/sec total_norm: 149.6373 (151.7165) loss: 135.4386 (135.5210) masked_loss: 1.3788 (1.3778) tag_loss: 133.7582 (134.1432) time: 1.4315 (1.9312) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4262 (1.9260) save_time: 8.8805 (13.8650) lr: 0.000001 max mem: 26307
2022-03-17 13:17:30,505.505 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7352941036224365
2022-03-17 13:17:30,505.505 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 137.94412231445312
2022-03-17 13:17:30,506.506 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.94123101054089
2022-03-17 13:18:03,528.528 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02417757734656334
2022-03-17 13:18:03,528.528 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 13:18:03,529.529 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'in', 'a', 'classroom', 'eating', 'a', 'don', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 13:18:03,544.544 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'table', 'person', 'boy', 'hair', 'chair', 'woman', '[UNK]', 'eye', 'hand', 'wall', 'nose', 'man', 'light', 'ceiling', 'head', 'shoe', 'jean', 'window', 'stool', 'logo', 'arm', 'restaurant', 'sign', 'girl', 'ear', 'letter', 'food', 'plate', 'background', 'face', 'paper', 'poster', 'door', 'floor', 'writing', 'board', 'pizza', 'fan', 'leg', 'hat', 'shelf', 'box', 'cap', 'young', 'bag', 'bottle', 'cup', 'picture', 'child']
2022-03-17 13:18:19,511.511 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'book', 'door', 'light', 'woman', 'board', 'hair', 'girl', 'person', 'table', 'wall', 'boy', 'eye', 'chair', 'window', 'sign', 'jean', 'shirt', 'background', 'nose', 'restaurant', 'kid', 'logo', 'classroom', 'shoe']
2022-03-17 13:20:43,128.128 2829:trainer.py:487 do_train_dict(): eta: 0:09:18 iter: 66100 speed: 265.3 images/sec total_norm: 147.4262 (149.1995) loss: 139.3563 (138.9289) masked_loss: 1.3532 (1.3958) tag_loss: 137.9441 (137.5331) time: 1.4318 (1.9299) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4265 (1.9247) save_time: 8.8805 (13.8650) lr: 0.000000 max mem: 26307
2022-03-17 13:20:43,489.489 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.8529411554336548
2022-03-17 13:20:43,489.489 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 113.00749969482422
2022-03-17 13:20:43,489.489 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.95179955404691
2022-03-17 13:21:16,746.746 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024219419807195663
2022-03-17 13:21:16,746.746 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 13:21:16,747.747 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'white', 'toilet', 'sitting', 'next', 'to', 'a', '[MASK]', 'tub', 'and', 'a', 'sink', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 13:21:16,762.762 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'bathroom', '[UNK]', 'tub', 'sink', 'toilet', 'mirror', 'floor', 'shower', 'shelf', 'pipe', 'lid', 'hose', 'tank', 'handle', 'ceiling', 'seat', 'head', 'light', 'bath', 'drain', 'soap', 'cabinet', 'tile', 'white', 'dish', 'knob', 'reflection', 'door', 'small', 'outlet', 'window', 'cord', 'bowl', 'vent', 'bottle', 'brush', 'plug', 'rack', 'towel', 'paper', 'rod', 'curtain', 'cup', 'base', 'hole', 'basket', 'bar', 'holder', 'vanity']
2022-03-17 13:21:32,692.692 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'next', 'white', 'light', 'floor', 'wall', 'seat', 'tank', 'handle', 'mirror', 'bathroom', 'bottle', 'ceiling', 'shower', 'bath', 'sink', 'brush', 'pipe', 'reflection', 'shelf', 'toilet', 'lid', 'tub', 'vent', 'hose']
2022-03-17 13:23:56,109.109 2829:trainer.py:487 do_train_dict(): eta: 0:06:21 iter: 66200 speed: 265.3 images/sec total_norm: 148.5136 (149.5745) loss: 139.0148 (138.5556) masked_loss: 1.3990 (1.3905) tag_loss: 137.5259 (137.1651) time: 1.4320 (1.9298) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4269 (1.9247) save_time: 8.8805 (13.8650) lr: 0.000000 max mem: 26307
2022-03-17 13:23:56,470.470 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091
2022-03-17 13:23:56,470.470 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 130.36961364746094
2022-03-17 13:23:56,470.470 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.95701368398076
2022-03-17 13:24:29,726.726 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024217350408434868
2022-03-17 13:24:29,726.726 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 13:24:29,727.727 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', '[MASK]', 'hydra', '##nts', 'on', 'a', 'side', 'walk', 'near', 'a', 'car', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 13:24:29,742.742 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['car', 'road', 'street', 'fire', '[UNK]', 'sidewalk', 'pole', 'line', 'plate', 'tree', 'license', 'curb', 'light', 'ground', 'sign', 'chain', 'mirror', 'leaf', 'base', 'building', 'person', 'top', 'tire', 'dirt', 'window', 'motorcycle', 'windshield', 'cap', 'man', 'cover', 'city', 'sky', 'trash', 'back', 'suv', 'bumper', 'paint', 'vehicle', 'side', 'lid', 'bolt', 'van', 'tail', 'traffic', 'jacket', 'rock', 'grass', 'truck', 'head', 'bike']
2022-03-17 13:24:45,651.651 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'many', 'side', 'line', 'building', 'road', 'street', 'car', 'fire', 'ground', 'base', 'walk', 'tree', 'plate', 'license', 'pole', 'leaf', 'lid', 'sidewalk', 'tire', 'puddle']
2022-03-17 13:27:09,520.520 2829:trainer.py:487 do_train_dict(): eta: 0:03:24 iter: 66300 speed: 264.7 images/sec total_norm: 148.9686 (151.6847) loss: 135.4109 (138.0281) masked_loss: 1.3913 (1.3951) tag_loss: 133.8510 (136.6329) time: 1.4329 (1.9341) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4278 (1.9290) save_time: 8.8805 (13.8650) lr: 0.000000 max mem: 26307
2022-03-17 13:27:09,881.881 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6969696879386902
2022-03-17 13:27:09,881.881 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 134.40628051757812
2022-03-17 13:27:09,881.881 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.96192062906472
2022-03-17 13:27:42,822.822 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024204757064580917
2022-03-17 13:27:42,823.823 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 13:27:42,823.823 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'traffic', 'sign', 'hung', 'upside', 'down', 'on', '[MASK]', 'pole', 'fuji', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 13:27:42,839.839 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sign', 'pole', 'arrow', 'grass', 'tree', 'road', 'letter', 'sky', 'street', 'line', 'post', 'pillar', 'car', 'leaf', 'ground', 'building', 'bridge', 'water', 'trunk', 'bush', 'sidewalk', 'hill', '[UNK]', 'shadow', 'fence', 'branch', 'stop', 'window', 'traffic', 'wire', 'background', 'curb', 'roof', 'wall', 'reflection', 'light', 'white', 'column', 'man', 'graffiti', 'way', 'next', 'person', 'highway', 'intersection', 'number', 'front', 'side', 'tire', 'park']
2022-03-17 13:27:58,824.824 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['line', 'building', 'road', 'power', 'car', 'post', 'tree', 'tower', 'letter', 'sign', 'sky', 'traffic', 'hung', 'grass', 'pole', 'leaf', 'wire', 'arrow', 'reflection']
2022-03-17 13:30:22,704.704 2829:trainer.py:487 do_train_dict(): eta: 0:00:26 iter: 66400 speed: 265.0 images/sec total_norm: 147.0204 (150.1494) loss: 137.1686 (139.1832) masked_loss: 1.4167 (1.4033) tag_loss: 135.7120 (137.7799) time: 1.4320 (1.9318) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4267 (1.9266) save_time: 8.8805 (13.8650) lr: 0.000000 max mem: 26307
2022-03-17 13:30:23,064.064 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.625
2022-03-17 13:30:23,064.064 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 141.86703491210938
2022-03-17 13:30:23,064.064 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.9596698445485
2022-03-17 13:30:56,296.296 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024200977757573128
2022-03-17 13:30:56,297.297 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 13:30:56,297.297 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'man', 'standing', 'with', 'a', 'bag', 'of', 'luggage', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 13:30:56,312.312 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['person', 'shadow', 'tree', 'head', 'man', 'shirt', 'line', 'wall', 'building', 'ground', '[UNK]', 'woman', 'eye', 'leg', 'photo', 'hair', 'road', 'jacket', 'wire', 'hat', 'pole', 'mirror', 'white', 'girl', 'hand', 'face', 'nose', 'wheel', 'sky', 'umbrella', 'boat', 'car', 'flag', 'sidewalk', 'background', 'kite', 'rope', 'bush', 'ramp', 'front', 'foot', 'floor', 'reflection', 'arm', 'shoe', 'dress', 'picture', 'ear', 'black', 'air']
2022-03-17 13:31:12,248.248 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'hand', 'building', 'ground', 'hair', 'floor', 'boy', 'foot', 'tree', 'sky', 'shirt', 'leg', 'background', 'bag', 'handle', 'shadow', 'bush', 'photo', 'pole', 'fence', 'sail', 'sidewalk', 'suitcase', 'luggage', 'hedge', 'skyscraper']
2022-03-17 13:31:33,985.985 2829:trainer.py:487 do_train_dict(): eta: 0:00:00 iter: 66415 speed: 718.3 images/sec total_norm: 146.6914 (150.1386) loss: 137.2871 (138.2889) masked_loss: 1.4118 (1.4059) tag_loss: 135.7408 (136.8830) time: 1.4321 (1.9305) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4270 (1.9253) save_time: 8.8805 (13.8650) lr: 0.000000 max mem: 26307
2022-03-17 13:31:35,075.075 2829:checkpoint.py:222 save(): Saving checkpoint to output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_final.pt
2022-03-17 13:32:00,524.524 2829:checkpoint.py:222 save(): Saving checkpoint to output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0066415.pt
2022-03-17 13:32:08,980.980 2829:trainer.py:525 do_train_dict(): Total training time: 1 day, 8:47:40.699242 (1.7776 s / it)
2022-03-17 13:32:09,051.051 2829:qd_common.py:625 cmd_run(): start to cmd run: zip -uyrv output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/source_code * -x \*src/CCSCaffe/\* -x \*src/build/lib.linux-x86_64-2.7/\* -x \*build/lib.linux-x86_64-2.7/\* -x \*build/temp.linux-x86_64-2.7/\* -x \*build/lib.linux-x86_64-3.5/\* -x \*build/temp.linux-x86_64-3.5/\* -x \*build/lib.linux-x86_64-3.7/\* -x assets\* -x \*build/temp.linux-x86_64-3.7/\* -x \*build/lib.linux-x86_64-3.6/\* -x
\*build/temp.linux-x86_64-3.6/\* -x \*src/detectron2/datasets/\* -x \*src/CCSCaffe/models/\* -x \*src/CCSCaffe/data/\* -x \*src/CCSCaffe/examples/\* -x \*src/detectron2/output\* -x aux_data/yolo9k/\* -x visualization\* -x output\* -x data\* -x \*.build_release\* -x \*.build_debug\* -x \*.build\* -x \*tmp_run\* -x \*src/CCSCaffe/MSVC/\* -x \*.pyc -x \*.so -x \*.o -x \*src/CCSCaffe/docs/tutorial/\* -x \*src/CCSCaffe/matlab/\* -x \*.git\* -x \*src/qd/mask/modeling/captioning/coco_caption\* -x \*src/qd/mask/modeling/captioning/cider/data\* zip warning: output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/source_code.zip not found or empty adding: CLIPS.ipynb (in=24240) (out=5496) (deflated 77%) adding: README.md (in=20) (out=20) (stored 0%) adding: T5_test.ipynb (in=16824) (out=4410) (deflated 74%) adding: Untitled.ipynb (in=3434027) (out=2161291) (deflated 37%) adding: Visualization.ipynb (in=590003) (out=417444) (deflated 29%) adding: aml_job_config.json (in=4534) (out=1814) (deflated 60%) adding: aux_data/ (in=0) (out=0) (stored 0%) adding: aux_data/configs/ (in=0) (out=0) (stored 0%) adding: aux_data/configs/vigblob_account.yaml (in=596) (out=408) (deflated 32%) adding: aux_data/configs/azure_blob_account.yaml (in=163) (out=148) (deflated 9%) adding: aux_data/configs/vigstandardblob_account.yaml (in=300) (out=247) (deflated 18%) adding: aux_data/configs/others/ (in=0) (out=0) (stored 0%) adding: aux_data/configs/others/vigcancentralblob_account.yaml (in=300) (out=247) (deflated 18%) adding: aux_data/configs/others/philly_vc.yaml (in=2564) (out=823) (deflated 68%) adding: aux_data/configs/others/vigblob_account.yaml (in=595) (out=395) (deflated 34%) adding: aux_data/configs/others/reditimgblob_account.yaml (in=199) (out=170) (deflated 15%) adding: aux_data/configs/others/pengchuan.yaml (in=346) (out=281) (deflated 19%) adding: aux_data/configs/others/vigstandardblob_account.yaml (in=300) (out=247) (deflated 18%) adding: aux_data/configs/others/jfgcommentblob_account.yaml (in=217) (out=174) (deflated 20%) adding: aux_data/configs/others/build_composite_dataset.yaml (in=193) (out=134) (deflated 31%) adding: aux_data/configs/others/bingproductblob_account.yaml (in=308) (out=258) (deflated 16%) adding: aux_data/configs/others/expid_generate.yaml (in=30687) (out=4745) (deflated 85%) adding: aux_data/configs/others/cognitive_credential.yaml (in=112) (out=101) (deflated 10%) adding: aux_data/configs/others/vigeastblob_account.yaml (in=293) (out=245) (deflated 16%) adding: aux_data/configs/others/multi_philly_vc.yaml (in=31) (out=28) (deflated 10%) adding: aux_data/configs/others/vigjpeastblob_account.yaml (in=298) (out=248) (deflated 17%) adding: aux_data/configs/others/vigaueastblob_account.yaml (in=301) (out=250) (deflated 17%) adding: aux_data/configs/others/eu2blob_account.yaml.backup (in=263) (out=235) (deflated 11%) adding: aux_data/configs/others/TaxHardV1.yaml (in=1556) (out=592) (deflated 62%) adding: aux_data/configs/others/azure_blob_account.yaml.backup (in=133) (out=128) (deflated 4%) adding: aux_data/configs/others/extra_tracking_philly_jobs.yaml (in=3) (out=3) (stored 0%) adding: aux_data/configs/others/vigwestblob_account.yaml (in=605) (out=418) (deflated 31%) adding: aux_data/configs/others/vigsouthcenterblob_account.yaml (in=603) (out=419) (deflated 31%) adding: aux_data/configs/others/xiyinwestmaskblob_account.yaml (in=311) (out=255) (deflated 18%) adding: 
aux_data/configs/others/cogsimagestorageblob_account.yaml (in=163) (out=148) (deflated 9%) adding: aux_data/configs/others/mongodb_credential.yaml (in=52) (out=42) (deflated 19%) adding: aux_data/configs/vigeastblob_account.yaml (in=293) (out=245) (deflated 16%) adding: aux_data/configs/xiyinwestmaskblob_account.yaml (in=311) (out=255) (deflated 18%) adding: aux_data/Jacob_config/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/16-384-vitbfocal20-bert-tokenizer-expanding-distill/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/16-384-vitbfocal20-bert-tokenizer-expanding-distill/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_bert_distill_weight_1.yaml (in=2070) (out=866) (deflated 58%) adding: aux_data/Jacob_config/16-384-vitbfocal20-bert-tokenizer-expanding-distill/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_bert_distill_s=8_weight_1.yaml (in=2078) (out=871) (deflated 58%) adding: aux_data/Jacob_config/16-384-vitbfocal20-bert-tokenizer-expanding-distill/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_bert_distill.yaml (in=2059) (out=865) (deflated 58%) adding: aux_data/Jacob_config/16-384-vitbfocal20-bert-tokenizer-expanding-distill/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_bert_distill_s=8-nonewtokenizer.yaml (in=2122) (out=884) (deflated 58%) adding: aux_data/Jacob_config/16-384-vitbfocal20-bert-tokenizer-expanding-distill/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_bert_distill_not_all_token.yaml (in=2086) (out=872) (deflated 58%) adding: aux_data/Jacob_config/16-384-vitbfocal20-bert-tokenizer-expanding-distill/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_bert_distill_s=8.yaml (in=2092) (out=876) (deflated 58%) adding: aux_data/Jacob_config/coco_captioning/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/log.txt (in=54093) (out=3928) (deflated 93%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_5e-5_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622519596_38814551.yaml (in=1466) (out=653) (deflated 55%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622918567_3a7ba0fc_008.yaml (in=1465) (out=646) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP_Zhiyuan-PyTorch-Test_1622519564_da993a57.yaml (in=1476) (out=654) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/log_OobjectDec.txt (in=0) (out=0) (stored 0%) adding: 
aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1623798151_a08273e7_008.yaml (in=1491) (out=662) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_5e-5_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1623117072_824f5561_008.yaml (in=1468) (out=653) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1623798165_bf4a05ac_008.yaml (in=1481) (out=654) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622955881_4a8af6c7.yaml (in=1456) (out=644) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622850281_ca891676_008.yaml (in=1463) (out=644) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_5e-5_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622918554_7eb3c00a.yaml (in=1460) (out=651) (deflated 55%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1623784968_3160e6bd_008.yaml (in=1495) (out=664) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1623126708_54017d63.yaml (in=1456) (out=643) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622918554_7eb3c00a.yaml (in=1460) (out=645) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622519596_38814551.yaml (in=1466) (out=648) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_5e-5_iter_20_with_VLP_Zhiyuan-PyTorch-Test_1622519596_38814551.yaml (in=1466) (out=653) (deflated 55%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622850281_ca891676.yaml (in=1454) (out=641) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_5e-5_iter_10_with_VLP_Zhiyuan-PyTorch-Test_1622519657_c6f06096.yaml (in=1456) (out=647) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1624054505_c137d6e9.yaml (in=1576) (out=680) (deflated 57%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1624053950_aae348f1.yaml (in=1520) (out=659) (deflated 57%) adding: 
aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622918567_3a7ba0fc.yaml (in=1456) (out=644) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_5e-5_iter_10_with_VLP_Zhiyuan-PyTorch-Test_1622519564_da993a57.yaml (in=1476) (out=660) (deflated 55%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP_Zhiyuan-PyTorch-Test_1622519657_c6f06096.yaml (in=1456) (out=642) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622918554_7eb3c00a_008.yaml (in=1469) (out=649) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1623116972_4496aa14.yaml (in=1461) (out=646) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622955881_4a8af6c7_008.yaml (in=1465) (out=646) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1623116931_2268c9d0,_008.yaml (in=1464) (out=644) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/priori/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/priori/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_5e-5_iter_10_with_VLP_80_epochs_0.08.yaml (in=1629) (out=694) (deflated 57%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/priori/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_without_VLP_Zhiyuan-PyTorch-Test.yaml (in=1250) (out=544) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/priori/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_NoVLP.yaml (in=1391) (out=607) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/priori/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1624090469_bf9ca5e4.yaml (in=1646) (out=705) (deflated 57%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/priori/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1624054095_f627b37c.yaml (in=1550) (out=668) (deflated 57%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/priori/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_5e-6_iter_10_with_VLP_80_epochs_0.08.yaml (in=1629) (out=694) (deflated 57%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/priori/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1624092038_a6c5b171_0.8.yaml (in=1692) (out=715) (deflated 58%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/priori/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP_80_epochs_0.08.yaml (in=1629) (out=689) (deflated 58%) adding: 
aux_data/Jacob_config/coco_captioning/10_epoch_VLP/priori/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1623905847_85c4023.yaml (in=1609) (out=692) (deflated 57%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/priori/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1624054505_c137d6e9.yaml (in=1606) (out=692) (deflated 57%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/priori/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_80_epochs_0.9.yaml (in=1618) (out=690) (deflated 57%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/priori/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1624091901_159eac17_0.9.yaml (in=1692) (out=714) (deflated 58%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/priori/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_80_epochs_0.08.yaml (in=1629) (out=693) (deflated 57%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/priori/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1624092038_a6c5b171.yaml (in=1692) (out=712) (deflated 58%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/priori/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1624090515_017cf63e.yaml (in=1644) (out=704) (deflated 57%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/priori/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1624091901_159eac17.yaml (in=1692) (out=713) (deflated 58%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP_Zhiyuan-PyTorch-Test_1622519596_38814551.yaml (in=1466) (out=647) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_20_with_VLP_Zhiyuan-PyTorch-Test_1622519657_c6f06096.yaml (in=1456) (out=642) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1623116972_4496aa14_008.yaml (in=1468) (out=647) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_5e-5_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622519564_da993a57.yaml (in=1476) (out=660) (deflated 55%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_20_with_VLP_Zhiyuan-PyTorch-Test_1622519564_da993a57.yaml (in=1476) (out=655) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1623798711_f6a7aa89_008.yaml (in=1483) (out=655) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1623126708_54017d63_008.yaml (in=1465) (out=645) (deflated 56%) adding: 
aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_5e-5_iter_20_with_VLP_Zhiyuan-PyTorch-Test_1622519564_da993a57.yaml (in=1476) (out=660) (deflated 55%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_5e-5_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1623117072_824f5561.yaml (in=1461) (out=651) (deflated 55%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_5e-5_iter_10_with_VLP_Zhiyuan-PyTorch-Test_1622519596_38814551.yaml (in=1466) (out=653) (deflated 55%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_5e-5_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622918554_7eb3c00a_008.yaml (in=1469) (out=654) (deflated 55%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622519564_da993a57.yaml (in=1476) (out=655) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1623116931_2268c9d0.yaml (in=1457) (out=644) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_5e-5_iter_20_with_VLP_Zhiyuan-PyTorch-Test_1622519657_c6f06096.yaml (in=1456) (out=648) (deflated 55%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1623117072_824f5561.yaml (in=1461) (out=645) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622850950_2792e5e4.yaml (in=1448) (out=636) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622519657_c6f06096.yaml (in=1456) (out=642) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1623784986_6e927e3b_008.yaml (in=1485) (out=658) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_5e-5_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622519657_c6f06096.yaml (in=1456) (out=648) (deflated 55%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_20_with_VLP_Zhiyuan-PyTorch-Test_1622519596_38814551.yaml (in=1466) (out=648) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622850950_2792e5e4_0.08.yaml (in=1457) (out=639) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/multi_scale/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/coco_captioning/multi_scale/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_224_lr_1e-4_iter_60_without_VLP_multiscale_112_64.yaml (in=1362) (out=569) 
(deflated 58%) adding: aux_data/Jacob_config/coco_captioning/multi_scale/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_224_lr_2e-4_iter_60_without_VLP_multiscale_112_64.yaml (in=1360) (out=571) (deflated 58%) adding: aux_data/Jacob_config/coco_captioning/multi_scale/Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_5e-5_iter_60_without_VLP_multiscale_192_96.yaml (in=1358) (out=572) (deflated 58%) adding: aux_data/Jacob_config/coco_captioning/multi_scale/Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_60_without_VLP_multiscale_192_96_token_sample_378.yaml (in=1437) (out=589) (deflated 59%) adding: aux_data/Jacob_config/coco_captioning/multi_scale/Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_60_without_VLP_multiscale_192_96.yaml (in=1358) (out=569) (deflated 58%) adding: aux_data/Jacob_config/coco_captioning/multi_scale/others/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/coco_captioning/multi_scale/others/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_5e-5_iter_60_without_VLP_multiscale_192_96.yaml (in=1414) (out=607) (deflated 57%) adding: aux_data/Jacob_config/coco_captioning/multi_scale/others/Vilt_captioning_testing_batch-size_128_encoder_vit_base_patch16_384_lr_2e-5_iter_60_without_VLP_multiscale_192_96.yaml (in=1337) (out=567) (deflated 58%) adding: aux_data/Jacob_config/coco_captioning/multi_scale/others/Vilt_captioning_testing_batch-size_128_encoder_vit_base_patch16_384_lr_5e-5_iter_60_without_VLP_multiscale_192_96.yaml (in=1337) (out=567) (deflated 58%) adding: aux_data/Jacob_config/coco_captioning/multi_scale/others/Vilt_captioning_testing_batch-size_128_encoder_vit_base_patch16_384_lr_1e-4_iter_60_without_VLP_multiscale_192_96_small_scale_0.9.yaml (in=1334) (out=564) (deflated 58%) adding: aux_data/Jacob_config/coco_captioning/multi_scale/others/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_224_lr_5e-5_iter_60_without_VLP_multiscale_112_64.yaml (in=1339) (out=569) (deflated 58%) adding: aux_data/Jacob_config/coco_captioning/multi_scale/others/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_224_lr_3e-4_iter_60_without_VLP_multiscale_112_64.yaml (in=1337) (out=567) (deflated 58%) adding: aux_data/Jacob_config/coco_captioning/multi_scale/Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_2e-4_iter_60_without_VLP_multiscale_192_96.yaml (in=1362) (out=572) (deflated 58%) adding: aux_data/Jacob_config/coco_captioning/multi_scale/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_60_without_VLP_multiscale_192_96.yaml (in=1433) (out=606) (deflated 58%) adding: aux_data/Jacob_config/coco_captioning/multi_scale/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_3e-4_iter_60_without_VLP_multiscale_192_96.yaml (in=1433) (out=608) (deflated 58%) adding: aux_data/Jacob_config/coco_captioning/others/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_2545.yaml (in=977) (out=496) (deflated 49%) adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_exp-n_100distill_iou_i2it2iatt.yaml (in=1114) (out=538) (deflated 52%) adding: 
aux_data/Jacob_config/coco_captioning/others/TaxCocoCaption_B_CapS_BS512_MaxIter0e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base63882_3aaa9.yaml (in=922) (out=449) (deflated 51%) adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_2462.yaml (in=1052) (out=528) (deflated 50%) adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_2443.yaml (in=1058) (out=532) (deflated 50%) adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_2482.yaml (in=1052) (out=530) (deflated 50%) adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_938.yaml (in=1000) (out=506) (deflated 49%) adding: aux_data/Jacob_config/coco_captioning/others/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_2e-5_iter_30_without_VLP.yaml (in=1191) (out=532) (deflated 55%) adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualatt_10distill_minilm_i2i_t2i.yaml (in=1125) (out=542) (deflated 52%) adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f.yaml (in=976) (out=496) (deflated 49%) adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_965.yaml (in=989) (out=501) (deflated 49%) adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualatt_10distill.yaml (in=1060) (out=516) (deflated 51%) adding: aux_data/Jacob_config/coco_captioning/others/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_2e-5_iter_30_with_VLP_exp2.yaml (in=1360) (out=626) (deflated 54%) adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_Jianfeng_Best_MiniVLM_LR5e-5.yaml (in=885) (out=478) (deflated 46%) adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_812.yaml (in=1045) (out=540) (deflated 48%) adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_VLP_TEE_BS512_MaxIter40e_LR1e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12.yaml (in=961) (out=474) (deflated 51%) adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_972.yaml (in=1062) (out=548) (deflated 48%) adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_exp-n_10distill_iou_i2it2iatt.yaml (in=1112) (out=532) (deflated 52%) adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_976.yaml (in=1001) (out=507) (deflated 49%) adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_2464.yaml (in=1052) (out=528) (deflated 50%) adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_minilm.yaml (in=1059) (out=519) (deflated 51%) adding: 
aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_845.yaml (in=1051) (out=529) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/TaxCocoCaption_Captionnig_Vilt-base_BS128_MaxIter40e_LR0.0001_exp1.yaml (in=859) (out=442) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_TaxCOCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_exp20.yaml (in=1041) (out=514) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_tinybert_whole_align_textualhidden_visualseed_exp11.yaml (in=1166) (out=549) (deflated 53%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_tinybert_whole_align_textualhidden_visualseed.yaml (in=1139) (out=542) (deflated 52%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_2578.yaml (in=1056) (out=543) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_230.yaml (in=1060) (out=547) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_5737.yaml (in=991) (out=501) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_VLP_TEE_BS512_MaxIter40e_LR2e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12.yaml (in=961) (out=474) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_945.yaml (in=1064) (out=548) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_2432.yaml (in=1052) (out=532) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_sparseatt_chamfer_queue.yaml (in=1065) (out=528) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_vlp761.yaml (in=1064) (out=533) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR5e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_5737.yaml (in=991) (out=501) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_exp24.yaml (in=1062) (out=532) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_vlp750.yaml (in=1036) (out=529) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_2484.yaml (in=1052) (out=530) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter40e_LR5e-06_WD0.05_Feff0f_Leff0f_2479.yaml (in=1058) (out=555) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_213.yaml (in=976) (out=489) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_2477.yaml (in=1052) (out=530) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_60_without_VLP_scale_0.08_jianfeng.yaml (in=1386) (out=636) (deflated 54%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_tinybert.yaml (in=1049) (out=516) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_60_without_VLP_scale_0.9_jianfeng.yaml (in=1411) (out=650) (deflated 54%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_B_CapS_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base6.yaml (in=935) (out=489) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR5e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_exp19.yaml (in=981) (out=490) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_2e-4_iter_30_without_VLP.yaml (in=1191) (out=531) (deflated 55%)
adding: aux_data/Jacob_config/coco_captioning/others/TaxCocoCaption_Captionnig_Vilt-base_BS128_MaxIter20e_LR0.00001.yaml (in=827) (out=429) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualatt_10distill_visualatt_learnable.yaml (in=1127) (out=537) (deflated 52%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualatt_10distill_10i2iatt_visrel.yaml (in=1118) (out=530) (deflated 53%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_tinybert_whole_align_textualhidden_visualseed_exp12.yaml (in=1172) (out=552) (deflated 53%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_CC_VLPs_JFTEE_BS512_MaxIter40e_LR5e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN.yaml (in=967) (out=484) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_exp19.yaml (in=981) (out=489) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/TaxCocoCaption_Captionnig_Vilt-base_BS128_MaxIter40e_LR0.0001.yaml (in=827) (out=429) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_i2t.yaml (in=1064) (out=521) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_Jianfeng_Best_MiniVLM_LR5e-6.yaml (in=889) (out=480) (deflated 46%)
adding: aux_data/Jacob_config/coco_captioning/others/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_5e-5_iter_30_with_VLP_vlp1267.yaml (in=1398) (out=649) (deflated 54%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_exp-n_10distill_iou_i2it2iatt.yaml (in=1112) (out=537) (deflated 52%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_exp20.yaml (in=1024) (out=518) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualatt_10distill.yaml (in=1066) (out=519) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_60_without_VLP_jianfeng.yaml (in=1418) (out=652) (deflated 54%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_2490.yaml (in=1052) (out=528) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR5e-06_WD0.05_Feff0f_Leff0f_Tie_ImgLN_793.yaml (in=1025) (out=529) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_OSCARb_7M_VLPs_PETER_BS512_MaxIter20e_LR1e-05_WD0.05_Fpeter_Lpeter.yaml (in=976) (out=502) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_tinybert_whole_align_textualhidden.yaml (in=1127) (out=536) (deflated 52%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_1038.yaml (in=1035) (out=529) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_VLPs_TaxGoogleCC64split_MiniVLM_LR5e-6.yaml (in=875) (out=460) (deflated 47%)
adding: aux_data/Jacob_config/coco_captioning/others/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_60_without_VLP_scale_0.08.yaml (in=1184) (out=520) (deflated 56%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR5e-06_WD0.05_Feff0f_Leff0f_815.yaml (in=1052) (out=545) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_tinybert_whole_align_textualhidden_visualseed_exp18.yaml (in=1139) (out=539) (deflated 53%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_890.yaml (in=1062) (out=547) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_OSCARb_7M_VLPs_PETER_BS512_MaxIter20e_LR5e-05_WD0.05_Fpeter_Lpeter.yaml (in=976) (out=502) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_832.yaml (in=1050) (out=529) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_5737.yaml (in=991) (out=500) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_934.yaml (in=1001) (out=507) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_995.yaml (in=1059) (out=546) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_VLP_TEE_BS512_MaxIter40e_LR5e-06_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12.yaml (in=961) (out=474) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualatt_MiniLM.yaml (in=1061) (out=523) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_977.yaml (in=1001) (out=507) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_2442.yaml (in=1059) (out=533) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_tinybert_text_align.yaml (in=1097) (out=528) (deflated 52%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR5e-06_WD0.05_Feff0f_Leff0f_Tie_ImgLN_exp20.yaml (in=1024) (out=519) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_i2t.yaml (in=1064) (out=522) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/TaxCocoCaption_Captionnig_Vilt-base_BS128_MaxIter40e_LR0.0001_exp2.yaml (in=859) (out=443) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_TaxCOCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_exp20_noalign.yaml (in=1055) (out=512) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_vlp762.yaml (in=1064) (out=534) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter40e_LR5e-06_WD0.05_Feff0f_Leff0f_217.yaml (in=1044) (out=543) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_i2t_th.yaml (in=1089) (out=534) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_768.yaml (in=981) (out=487) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_2420.yaml (in=1060) (out=531) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_i2t_th.yaml (in=1089) (out=535) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_2585.yaml (in=1060) (out=546) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_2e-4_iter_60_without_VLP.yaml (in=1197) (out=530) (deflated 56%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualatt_10distill_visualatt_learnable.yaml (in=1127) (out=536) (deflated 52%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_tinybert_woemb.yaml (in=1077) (out=524) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter40e_LR4e-05_WD0.05_Feff0f_Leff0f_.yaml (in=986) (out=452) (deflated 54%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualatt_MiniLM.yaml (in=1061) (out=523) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_exp20 (copy).yaml (in=1026) (out=519) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_vlp752.yaml (in=1058) (out=530) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_2e-5_iter_30_with_VLP_exp3.yaml (in=1347) (out=617) (deflated 54%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_950.yaml (in=994) (out=502) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_888.yaml (in=1045) (out=539) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_minilm.yaml (in=1049) (out=516) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_CC_VLPs_JFTEE_BS512_MaxIter40e_LR1e-03_WD0.05_Feff0f_Leff0f_Tie_ImgLN.yaml (in=969) (out=487) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_2e-4_iter_60_without_VLP.yaml (in=1191) (out=531) (deflated 55%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_1029.yaml (in=1062) (out=546) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_2489.yaml (in=1052) (out=530) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualatt_10distill_logit.yaml (in=1075) (out=523) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_2422.yaml (in=1060) (out=531) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualatt_10distill_10i2iatt_sinvisrel.yaml (in=1127) (out=536) (deflated 52%)
adding: aux_data/Jacob_config/coco_captioning/others/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_224_lr_2e-4_iter_60_without_VLP.yaml (in=1197) (out=530) (deflated 56%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_vlp750.yaml (in=1036) (out=530) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/SCRATCH_CocoCaption_MiniVLM_COCO_VLP_TEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12.yaml (in=900) (out=451) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_VLP_TEE_BS512_MaxIter40e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12.yaml (in=961) (out=474) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_2491.yaml (in=1052) (out=529) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_218.yaml (in=1047) (out=542) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_842.yaml (in=1054) (out=550) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_sparseatt.yaml (in=1045) (out=519) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_224_lr_1e-4_iter_30_without_VLP.yaml (in=1197) (out=528) (deflated 56%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_sparseatt_chamfer.yaml (in=1049) (out=525) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_weightedfeat.yaml (in=1082) (out=524) (deflated 52%)
adding: aux_data/Jacob_config/coco_captioning/others/TaxCocoCaption_Captionnig_Vilt-base_BS128_MaxIter40e_LR0.0001_exp3.yaml (in=859) (out=444) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_940.yaml (in=1001) (out=506) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualatt.yaml (in=1030) (out=510) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_VLPs_TaxGoogleCC64split_MiniVLM_LR5e-5.yaml (in=875) (out=460) (deflated 47%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_979.yaml (in=1001) (out=507) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_973.yaml (in=1062) (out=549) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_947.yaml (in=1062) (out=547) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_exp26.yaml (in=1062) (out=532) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_exp25.yaml (in=1062) (out=532) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualatt_10distill.yaml (in=1060) (out=516) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_894.yaml (in=990) (out=500) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_tinybert_whole_align_textualhidden_visualseed_exp16.yaml (in=1145) (out=541) (deflated 53%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_B_CapS_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base6.yaml (in=945) (out=498) (deflated 47%)
adding: aux_data/Jacob_config/coco_captioning/others/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_5e-5_iter_30_with_VLP_exp2.yaml (in=1387) (out=642) (deflated 54%)
adding: aux_data/Jacob_config/coco_captioning/others/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_2e-5_iter_60_without_VLP.yaml (in=1191) (out=532) (deflated 55%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR5e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_exp20.yaml (in=1024) (out=518) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_895.yaml (in=1050) (out=542) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_231.yaml (in=1057) (out=547) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_TaxCOCO_Distill_JFTEE_BS512_MaxIter20e_LR5e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_exp20_noalign.yaml (in=1055) (out=514) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_793.yaml (in=1025) (out=529) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_942.yaml (in=1001) (out=507) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_exp22.yaml (in=1056) (out=527) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_971.yaml (in=1061) (out=548) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_tinybert_whole_align_textualhidden.yaml (in=1127) (out=536) (deflated 52%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter40e_LR5e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_exp19.yaml (in=981) (out=489) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_2485.yaml (in=993) (out=503) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_vlp745.yaml (in=1037) (out=519) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_60_without_VLP.yaml (in=1197) (out=530) (deflated 56%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter40e_LR1e-03_WD0.05_Feff0f_Leff0f_Tie_ImgLN_exp20.yaml (in=1024) (out=518) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter40e_LR5e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_exp20.yaml (in=1024) (out=519) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_841.yaml (in=1055) (out=550) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR5e-06_WD0.05_Feff0f_Leff0f.yaml (in=976) (out=496) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_TaxCOCO_Distill_JFTEE_BS512_MaxIter20e_LR5e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_exp20.yaml (in=1041) (out=514) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR5e-06_WD0.05_Feff0f_Leff0f_Tie_ImgLN_exp19.yaml (in=981) (out=490) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_215.yaml (in=1054) (out=549) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_889.yaml (in=1050) (out=542) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_978.yaml (in=1001) (out=507) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_815.yaml (in=1052) (out=546) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_queue128.yaml (in=1061) (out=543) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_vlp745.yaml (in=1037) (out=519) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_tinybert.yaml (in=1059) (out=520) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_979_eval.yaml (in=982) (out=445) (deflated 55%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_exp-n_100distill_iou_i2it2iatt.yaml (in=1114) (out=534) (deflated 52%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_queue1.yaml (in=1056) (out=539) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_entireseq.yaml (in=1029) (out=513) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_VLP_TEE_BS512_MaxIter40e_LR1e-06_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12.yaml (in=961) (out=475) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_970.yaml (in=1061) (out=547) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/TaxCocoCaption_Captionnig_Vilt-base_BS128_MaxIter40e_LR0.00001.yaml (in=827) (out=430) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_238.yaml (in=1059) (out=551) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_exp19.yaml (in=983) (out=490) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_2e-5_iter_30_with_VLP_cust298.yaml (in=1384) (out=639) (deflated 54%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualatt_10distill_minilm_i2i_t2i.yaml (in=1125) (out=543) (deflated 52%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_tinybert_whole_align_textualhidden_visualseed_exp13.yaml (in=1172) (out=552) (deflated 53%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_queue512.yaml (in=1062) (out=542) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_824.yaml (in=1055) (out=552) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualatt_10distill_minilm.yaml (in=1094) (out=531) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12.yaml (in=996) (out=499) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualatt_10distill_minilm.yaml (in=1101) (out=534) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_975.yaml (in=1000) (out=506) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_tinybert_whole_align.yaml (in=1101) (out=536) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_941.yaml (in=1001) (out=507) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_OSCARb_7M_VLPs_PETER_BS512_MaxIter20e_LR5e-06_WD0.05_Fpeter_Lpeter.yaml (in=976) (out=502) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualatt_10distill_10i2iatt_sinvisrel.yaml (in=1127) (out=535) (deflated 53%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_exp22.yaml (in=1056) (out=528) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_2479.yaml (in=1058) (out=554) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_tinybert_whole_align_textualhidden_visualseed.yaml (in=1129) (out=538) (deflated 52%)
adding: aux_data/Jacob_config/coco_captioning/others/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_2e-5_iter_30_with_VLP_exp1.yaml (in=1343) (out=615) (deflated 54%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_tinybert_text_align.yaml (in=1097) (out=527) (deflated 52%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_B_CapS_BS512_MaxIter30e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base6.yaml (in=945) (out=498) (deflated 47%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_1020.yaml (in=1061) (out=547) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_tinybert_woemb.yaml (in=1077) (out=524) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_2586.yaml (in=1059) (out=545) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR5e-06_WD0.05_Feff0f_Leff0f_2479.yaml (in=1058) (out=555) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_827.yaml (in=1050) (out=529) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualatt_10distill_10i2iatt_visrel.yaml (in=1118) (out=530) (deflated 53%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_960.yaml (in=1041) (out=546) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_944.yaml (in=1061) (out=546) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_893.yaml (in=991) (out=500) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_2e-5_iter_30_with_VLP_vlp1267.yaml (in=1398) (out=649) (deflated 54%)
adding: aux_data/Jacob_config/coco_captioning/others/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_5e-5_iter_30_with_VLP_cust296.yaml (in=1394) (out=643) (deflated 54%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter40e_LR5e-06_WD0.05_Feff0f_Leff0f_218.yaml (in=1045) (out=543) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_weightedfeat.yaml (in=1082) (out=524) (deflated 52%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_959.yaml (in=1040) (out=547) (deflated 47%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_2421.yaml (in=1060) (out=532) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_queue64.yaml (in=1059) (out=540) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_Oscar_TEE_Cap_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12.yaml (in=973) (out=492) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_943.yaml (in=1064) (out=548) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_OSCARb_7M_VLPs_PETER_BS512_MaxIter20e_LR5e-06_WD0.05_Fpeter_Lpeter2.yaml (in=968) (out=499) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_tinybert_whole_align.yaml (in=1101) (out=535) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_2458.yaml (in=1053) (out=546) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter40e_LR4e-05_WD0.05_Feff0f_Leff0f.yaml (in=976) (out=496) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/TaxCocoCaption_Captionnig_Vilt-base_BS128_MaxIter40e_LR0.0005.yaml (in=827) (out=430) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_831.yaml (in=1050) (out=528) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_tinybert_whole_align_textualhidden_visualseed_exp10.yaml (in=1155) (out=541) (deflated 53%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_VLP_TEE_BS512_MaxIter40e_LR4e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12.yaml (in=961) (out=475) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_946.yaml (in=1062) (out=547) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/TaxCocoCaption_Captionnig_Vilt-base_BS128_MaxIter40e_LR0.0001_exp6.yaml (in=859) (out=445) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_2e-5_iter_30_with_VLP_cust296.yaml (in=1394) (out=643) (deflated 54%)
adding: aux_data/Jacob_config/coco_captioning/others/TaxCocoCaption_Captionnig_Vilt-base_BS128_MaxIter20e_LR0.0001.yaml (in=827) (out=429) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualatt_10distill_logit.yaml (in=1075) (out=523) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_2444.yaml (in=1056) (out=532) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_2434.yaml (in=1047) (out=528) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_VLP_TEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12.yaml (in=965) (out=478) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_916.yaml (in=1064) (out=548) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_exp20.yaml (in=1026) (out=519) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_VLP_TEE_BS512_MaxIter40e_LR6e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12.yaml (in=961) (out=475) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_224_lr_1e-4_iter_60_without_VLP.yaml (in=1197) (out=528) (deflated 56%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_844.yaml (in=991) (out=499) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_5e-5_iter_30_with_VLP_cust298.yaml (in=1384) (out=640) (deflated 54%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_vlp752.yaml (in=1058) (out=530) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_queue1024.yaml (in=1065) (out=541) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter30e_LR2e-05_WD0.05_Feff0f_Leff0f_2479.yaml (in=1058) (out=555) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_VLP_TEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12.yaml (in=961) (out=475) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_VLP_TEE_BS512_MaxIter20e_LR5e-06_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12.yaml (in=961) (out=475) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/ (in=0) (out=0) (stored 0%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP_Zhiyuan-PyTorch-Test_1622519564_da993a57.yaml (in=1813) (out=791) (deflated 56%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_scale_0.08_Zhiyuan-PyTorch-Test_1622519564_da993a57_0.08.yaml (in=1824) (out=790) (deflated 57%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/log.txt (in=53920) (out=3867) (deflated 93%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP_scale_0.08_Zhiyuan-PyTorch-Test_1622519596_38814551.yaml (in=1814) (out=783) (deflated 57%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_20_with_VLP_Zhiyuan-PyTorch-Test_1622519596_38814551.yaml (in=1803) (out=781) (deflated 57%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_20_with_VLP_scale_0.08_Zhiyuan-PyTorch-Test_1622519596_38814551.yaml (in=1814) (out=785) (deflated 57%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_20_with_VLP_Zhiyuan-PyTorch-Test_1622519657_c6f06096.yaml (in=1793) (out=776) (deflated 57%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622519596_38814551.yaml (in=1803) (out=785) (deflated 56%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP_Zhiyuan-PyTorch-Test_1622519596_38814551.yaml (in=1803) (out=784) (deflated 57%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622519657_c6f06096.yaml (in=1793) (out=777) (deflated 57%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP_Zhiyuan-PyTorch-Test_1622519657_c6f06096.yaml (in=1793) (out=776) (deflated 57%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/priori/ (in=0) (out=0) (stored 0%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/priori/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622519596_38814551.yaml (in=1862) (out=803) (deflated 57%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/priori/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622519657_c6f06096.yaml (in=1852) (out=798) (deflated 57%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/priori/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622556377_1e410b29.yaml (in=1838) (out=786) (deflated 57%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/priori/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622519564_da993a57.yaml (in=1872) (out=806) (deflated 57%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/priori/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622556174_f5b7f243.yaml (in=1851) (out=795) (deflated 57%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_20_with_VLP_scale_0.08_Zhiyuan-PyTorch-Test_1622519657_c6f06096.yaml (in=1804) (out=777) (deflated 57%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP_scale_0.08_Zhiyuan-PyTorch-Test_1622519564_da993a57.yaml (in=1824) (out=790) (deflated 57%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_20_with_VLP_scale_0.08_Zhiyuan-PyTorch-Test_1622519564_da993a57.yaml (in=1824) (out=789) (deflated 57%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622519564_da993a57.yaml (in=1813) (out=789) (deflated 56%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP_scale_0.08_Zhiyuan-PyTorch-Test_1622519657_c6f06096.yaml (in=1804) (out=777) (deflated 57%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_scale_0.08_Zhiyuan-PyTorch-Test_1622519657_c6f06096_0.08.yaml (in=1804) (out=778) (deflated 57%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_20_with_VLP_Zhiyuan-PyTorch-Test_1622519564_da993a57.yaml (in=1813) (out=790) (deflated 56%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_scale_0.08_Zhiyuan-PyTorch-Test_1622519596_38814551_0.08.yaml (in=1814) (out=783) (deflated 57%)
adding: aux_data/Jacob_config/coco_captioning/Vilt_VLP/ (in=0) (out=0) (stored 0%)
adding: aux_data/Jacob_config/coco_captioning/Vilt_VLP/TaxCocoCaption_Captionnig_Vilt-base_BS128_MaxIter20e_LR0.00001_vlp_lr_1e-5_withaugatvlpfinetune_.yaml (in=1014) (out=508) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/Vilt_VLP/others/ (in=0) (out=0) (stored 0%)
adding: aux_data/Jacob_config/coco_captioning/Vilt_VLP/others/TaxCocoCaption_Captionnig_Vilt-base_BS128_MaxIter20e_LR0.00001_vlp_lr_5e-5_withaugatvlpfinetune.yaml (in=957) (out=492) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/Vilt_VLP/others/TaxCocoCaption_Captionnig_Vilt-base_BS128_MaxIter20e_LR0.00001_vlp_lr_1e-4_withaugatfinetune.yaml (in=944) (out=487) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/Vilt_VLP/others/TaxCocoCaption_Captionnig_Vilt-base_BS128_MaxIter20e_LR0.00001_vlp_lr_4e-4_withaugatfinetune.yaml (in=949) (out=488) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/Vilt_VLP/others/TaxCocoCaption_Captionnig_Vilt-base_BS128_MaxIter20e_LR0.00001_vlp_lr_4e-4.yaml (in=945) (out=487) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/Vilt_VLP/others/TaxCocoCaption_Captionnig_Vilt-base_BS128_MaxIter20e_LR0.00001_vlp_lr_1e-4.yaml (in=940) (out=486) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/Vilt_VLP/others/TaxCocoCaption_Captionnig_Vilt-base_BS128_MaxIter20e_LR0.00001_vlp_lr_1e-4_withaugatvlpfinetune_.yaml (in=959) (out=493) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/Vilt_VLP/others/TaxCocoCaption_Captionnig_Vilt-base_BS128_MaxIter20e_LR0.00001_vlp_lr_1e-4_withaugatvlpfinetune.yaml (in=959) (out=493) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/Vilt_VLP/TaxCocoCaption_Captionnig_Vilt-base_BS128_MaxIter20e_LR0.00001_vlp_lr_5e-6_withaugatvlpfinetune.yaml (in=1012) (out=508) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/Vilt_VLP/TaxCocoCaption_Captionnig_Vilt-base_BS128_MaxIter20e_LR0.00001_vlp_lr_1e-4_withaugatvlpfinetune.yaml (in=1012) (out=507) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/Vilt_VLP/TaxCocoCaption_Captionnig_Vilt-base_BS128_MaxIter20e_LR0.00001_vlp_lr_1e-5_withaugatvlpfinetune.yaml (in=1012) (out=507) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/after_VLP/ (in=0) (out=0) (stored 0%)
adding: aux_data/Jacob_config/coco_captioning/after_VLP/Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_5e-5_iter_20_multiscale_192_96.yaml (in=1466) (out=663) (deflated 55%)
adding: aux_data/Jacob_config/coco_captioning/after_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_5e-5_iter_20_VLP-Distill.yaml (in=1404) (out=631) (deflated 55%)
adding: aux_data/Jacob_config/coco_captioning/after_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1.5e-4_iter_20_VLP-Distill.yaml (in=1408) (out=629) (deflated 55%)
adding: aux_data/Jacob_config/coco_captioning/after_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_20_VLP.yaml (in=1390) (out=618) (deflated 56%)
adding: aux_data/Jacob_config/coco_captioning/after_VLP/Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_5e-5_iter_20_multiscale_192_96_smallscale_0.08.yaml (in=1499) (out=666) (deflated 56%)
adding: aux_data/Jacob_config/coco_captioning/after_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_20_VLP-Distill.yaml (in=1404) (out=626) (deflated 55%)
adding: aux_data/Jacob_config/coco_captioning/after_VLP/Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_5e-6_iter_20_multiscale_192_96.yaml (in=1466) (out=662) (deflated 55%)
adding: aux_data/Jacob_config/coco_captioning/test.yaml (in=1228) (out=597) (deflated 51%)
adding: aux_data/Jacob_config/resume/ (in=0) (out=0) (stored 0%)
adding: aux_data/Jacob_config/resume/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_512_iter80_s2s_multi_scale_crop_1_2_4.yaml (in=2166) (out=815) (deflated 62%)
adding: aux_data/Jacob_config/resume/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_512_iter80_s2s_multi_scale_crop_1_1_1.yaml (in=2109) (out=797) (deflated 62%)
adding: aux_data/Jacob_config/resume/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_256_iter10_s2s_multi_scale_multi_crop_gumbel_sample_0.9_10logit.yaml (in=2272) (out=838) (deflated 63%)
adding: aux_data/Jacob_config/vqa_distill/ (in=0) (out=0) (stored 0%)
adding: aux_data/Jacob_config/vqa_distill/Jacob_Vilt_VQA_Distill_testing_batch-size_256_encoder_vit_base_patch16_224_lr_1e-4_iter_20_small_0.9_without_VLP_Teacher_eff0fpeter_71_1ce_1logit_0hid.yaml (in=1898) (out=824) (deflated 57%)
adding: aux_data/Jacob_config/vqa_distill/Kim_Distill/ (in=0) (out=0) (stored 0%)
adding: aux_data/Jacob_config/vqa_distill/Kim_Distill/Jacob_Vilt_VQA_Distill_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP_Teacher_vinvl_large_1ce_1logit.yaml (in=1711) (out=731) (deflated 57%)
adding: aux_data/Jacob_config/vqa_distill/Kim_Distill/Jacob_Vilt_VQA_Distill_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP_Teacher_vinvl_large_0ce_1logit.yaml (in=1711) (out=731) (deflated 57%)
adding: aux_data/Jacob_config/vqa_distill/Kim_Distill/Jacob_Vilt_VQA_Distill_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP_Teacher_vinvl_large_0ce_10logit.yaml (in=1714) (out=731) (deflated 57%)
adding: aux_data/Jacob_config/vqa_distill/Kim_Distill/Jacob_Vilt_VQA_Distill_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP_Teacher_vinvl_large_1ce_10logit.yaml (in=1714) (out=732) (deflated 57%)
adding: aux_data/Jacob_config/vqa_distill/Jacob_Vilt_VQA_Distill_testing_batch-size_256_encoder_vit_base_patch16_224_lr_1e-4_iter_20_small_0.9_without_VLP_Teacher_eff0fpeter_71_1ce_0logit_10hid.yaml (in=1901) (out=825) (deflated 57%)
adding: aux_data/Jacob_config/vqa_distill/Zhiyuan-PyTorch-Test_1620849955_7efc3f53_test.yaml (in=2051) (out=837) (deflated 59%)
adding: aux_data/Jacob_config/vqa_distill/Jacob_Vilt_VQA_Distill_testing_batch-size_256_encoder_vit_base_patch16_224_lr_1e-4_iter_20_small_0.9_without_VLP_Teacher_eff0fpeter_71_1ce_1logit_1hid.yaml (in=1898) (out=822) (deflated 57%)
adding: aux_data/Jacob_config/vqa_distill/Zhiyuan-PyTorch-Test_1620849955_7efc3f53.yaml (in=2052) (out=835) (deflated 59%)
adding: aux_data/Jacob_config/vqa_distill/Zhiyuan-PyTorch-Test_1620847155_ada34def.yaml (in=2052) (out=835) (deflated 59%)
adding: aux_data/Jacob_config/vqa_distill/Zhiyuan-PyTorch-Test_1620849935_34a03392.yaml (in=2052) (out=835) (deflated 59%)
adding: aux_data/Jacob_config/vqa_distill/Jacob_Vilt_VQA_Distill_testing_batch-size_256_encoder_vit_base_patch16_224_lr_1e-4_iter_20_small_0.9_without_VLP_Teacher_eff0fpeter_71_1ce_0logit_0hid.yaml (in=1898) (out=823) (deflated 57%)
adding: aux_data/Jacob_config/cc_captioning/ (in=0) (out=0) (stored 0%)
adding: aux_data/Jacob_config/cc_captioning/Google-CC_Vilt_captioning_testing_batch-size_256_ENC_DEC_vit_base_patch16_384_lr_1e-4_iter_120_without_VLP.yaml (in=1774) (out=715) (deflated 60%)
adding: aux_data/Jacob_config/cc_captioning/Google-CC_Vilt_captioning_testing_batch-size_256_ENC_DEC_vit_base_patch16_384_lr_1e-4_iter_120_without_VLP_test.yaml (in=1792) (out=721) (deflated 60%)
adding: aux_data/Jacob_config/cc_captioning/Google-CC_Vilt_captioning_testing_batch-size_16_encoder_vit_base_patch16_384_lr_1e-4_iter_120_with_VLP_distillation.yaml (in=2122) (out=901) (deflated 58%)
adding: aux_data/Jacob_config/cc_captioning/Vilt_cc_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Jacob_Vilt_CC_captioning_val_testing_batch-size_512_encoder_vit_base_patch32_384_with_VLP_Zhiyuan-PyTorch-Test_1623116931_2268c9d0.yaml (in=1383) (out=568) (deflated 59%)
adding: aux_data/Jacob_config/cc_captioning/vinvl_large_CC_caption_uni_batch-size_1024_lr_5e-5_iter_30.yaml (in=1163) (out=523) (deflated 55%)
adding: aux_data/Jacob_config/cc_captioning/vinvl_CC_CIDER_caption_uni_batch-size_64_lr_5e-5_iter_5.yaml (in=1152) (out=526) (deflated 54%)
adding: aux_data/Jacob_config/cc_captioning/Vilt_cc_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Jacob_Vilt_CC_captioning_val_testing_batch-size_512_encoder_vit_base_patch32_384_with_VLP_Zhiyuan-PyTorch-Test_1622850950_2792e5e4 .yaml (in=1362) (out=562) (deflated 59%)
adding: aux_data/Jacob_config/cc_captioning/vinvl_CC_caption_uni_batch-size_1024_lr_5e-5_iter_30.yaml (in=1107) (out=503) (deflated 55%)
adding: aux_data/Jacob_config/cc_captioning/vinvl_CC_caption_uni_batch-size_512_lr_5e-5_iter_10.yaml (in=1104) (out=502) (deflated 55%)
adding: aux_data/Jacob_config/cc_captioning/Google-CC_Vilt_captioning_testing_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_120_with_vlp.yaml (in=2081) (out=880) (deflated 58%)
adding: aux_data/Jacob_config/cc_captioning/Google-CC_Vilt_captioning_testing_batch-size_16_encoder_vit_base_patch16_384_lr_1e-4_iter_30_without_VLP_multi_scale_2_4.yaml (in=1802) (out=712) (deflated 60%)
adding: aux_data/Jacob_config/NoCaps/ (in=0) (out=0) (stored 0%)
adding: aux_data/Jacob_config/NoCaps/JACOB_CIDER_Jacob_Logit_Distill_Vilt_captioning_testing_lr_5.0e-6_iter_30_batch-size_32_ENC_DEC_patch16_384_ENC_DEC_S2S_80_VLP_tags_vitbfocal10_num_2_epoch_5_lr_3e-6_val_cbs.yaml (in=1989) (out=829) (deflated 58%)
adding: aux_data/Jacob_config/NoCaps/JACOB_CIDER_Jacob_Logit_Distill_Vilt_captioning_testing_lr_5.0e-6_iter_30_batch-size_32_ENC_DEC_patch16_384_ViTCAP_NO-VLP_tags_vitbfocal10_num_2_epoch_5_lr_3e-6_val_no-cbs.yaml (in=2015) (out=799) (deflated 60%)
adding: aux_data/Jacob_config/NoCaps/cider_Jacob_Logit_Distill_Vilt_captioning_testing_lr_1.0e-4_iter_30_batch-size_256_encoder_vit_base_patch16_384_without_VLP_multi_scale_fix_BS_64_Sample_276.yaml (in=1588) (out=640) (deflated 60%)
adding: aux_data/Jacob_config/NoCaps/JACOB_CIDER_Jacob_Logit_Distill_Vilt_captioning_testing_lr_5.0e-6_iter_30_batch-size_32_ENC_DEC_patch16_384_ViTCAP_NO-VLP_tags_vitbfocal10_num_2_epoch_5_lr_3e-6_val_cbs.yaml (in=2038) (out=800) (deflated 61%)
adding: aux_data/Jacob_config/NoCaps/JACOB_CIDER_Jacob_Logit_Distill_Vilt_captioning_testing_lr_5.0e-6_iter_30_batch-size_32_ENC_DEC_patch16_384_ENC_DEC_S2S_80_VLP_tags_vitbfocal10_num_2_epoch_5_lr_3e-6.yaml (in=1715) (out=744) (deflated 57%)
adding: aux_data/Jacob_config/NoCaps/JACOB_CIDER_Jacob_Logit_Distill_Vilt_captioning_testing_lr_5.0e-6_iter_30_batch-size_32_ENC_DEC_patch16_384_ViTCAP_NO-VLP_tags_vitbfocal10_num_2_epoch_5_lr_3e-6_val_cbs_cider.yaml (in=2128) (out=889) (deflated 58%)
adding: aux_data/Jacob_config/NoCaps/JACOB_CIDER_Jacob_Logit_Distill_Vilt_captioning_testing_lr_5.0e-6_iter_30_batch-size_32_ENC_DEC_patch16_384_ENC_DEC_S2S_80_VLP_tags_vitbfocal10_num_2_epoch_5_lr_3e-6_val_no-cbs.yaml (in=1990) (out=828) (deflated 58%)
adding: aux_data/Jacob_config/NoCaps/JACOB_CIDER_Jacob_Logit_Distill_Vilt_captioning_testing_lr_5.0e-6_iter_30_batch-size_32_ENC_DEC_patch16_384_ENC_DEC_S2S_80_VLP_tags_vitbfocal10_num_2_epoch_5_lr_3e-6_val_cbs2.yaml (in=1989) (out=829) (deflated 58%)
adding: aux_data/Jacob_config/NoCaps/JACOB_CIDER_Jacob_Logit_Distill_Vilt_captioning_testing_lr_5.0e-6_iter_30_batch-size_32_ENC_DEC_patch16_384_ENC_DEC_S2S_80_VLP_tags_vitbfocal10_num_2_epoch_5_lr_3e-6_val.yaml (in=1965) (out=818) (deflated 58%)
adding: aux_data/Jacob_config/VQA/ (in=0) (out=0) (stored 0%)
adding: aux_data/Jacob_config/VQA/Vilt_VQA_testing_batch-size_1024_encoder_vit_base_patch16_384_lr_5e-4_iter_40_small_0.08_without_VLP.yaml (in=1357) (out=586) (deflated 57%)
adding: aux_data/Jacob_config/VQA/Vilt_VQA_testing_batch-size_1024_encoder_vit_base_patch16_384_lr_2e-4_iter_40_small_0.8_without_VLP.yaml (in=1354) (out=586) (deflated 57%)
adding: aux_data/Jacob_config/VQA/Vilt_VQA_testing_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_40_small_0.8_without_VLP.yaml (in=1360) (out=588) (deflated 57%)
adding: aux_data/Jacob_config/VQA/Vilt_VQA_testing_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_40_small_0.08_without_VLP.yaml (in=1357) (out=584) (deflated 57%)
adding: aux_data/Jacob_config/VQA/Vilt_VQA_testing_batch-size_1024_encoder_vit_base_patch16_384_lr_5e-5_iter_40_small_0.08_without_VLP.yaml (in=1357) (out=584) (deflated 57%)
adding: aux_data/Jacob_config/VQA/VLP/ (in=0) (out=0) (stored 0%)
adding: aux_data/Jacob_config/VQA/VLP/Vilt_VQA_testing_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_40_small_0.8_without_VLP_Zhiyuan-PyTorch-Test_1622519596_38814551.yaml (in=1572) (out=689) (deflated 56%)
adding: aux_data/Jacob_config/VQA/VLP/Vilt_VQA_testing_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_40_small_0.8_without_VLP_Zhiyuan-PyTorch-Test_1622556174_f5b7f243_20epoch.yaml (in=1577) (out=689) (deflated 56%)
adding: aux_data/Jacob_config/VQA/VLP/Vilt_VQA_testing_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_40_small_0.8_without_VLP_Zhiyuan-PyTorch-Test_1622556377_1e410b29_20epoch.yaml (in=1564) (out=678) (deflated 57%)
adding: aux_data/Jacob_config/VQA/VLP/Vilt_VQA_testing_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_40_small_0.8_without_VLP_Zhiyuan-PyTorch-Test_1622519657_c6f06096_20epoch.yaml (in=1578) (out=692) (deflated 56%)
adding: aux_data/Jacob_config/VQA/VLP/Vilt_VQA_testing_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_40_small_0.8_without_VLP_Zhiyuan-PyTorch-Test_1622519564_da993a57_20epoch.yaml (in=1598) (out=701) (deflated 56%)
adding: aux_data/Jacob_config/VQA/VLP/Vilt_VQA_testing_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_40_small_0.8_without_VLP_Zhiyuan-PyTorch-Test_1622556174_f5b7f243.yaml (in=1561) (out=684) (deflated 56%)
adding: aux_data/Jacob_config/VQA/VLP/Vilt_VQA_testing_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_40_small_0.8_without_VLP_Zhiyuan-PyTorch-Test_1622519564_da993a57.yaml (in=1582) (out=695) (deflated 56%)
adding: aux_data/Jacob_config/VQA/VLP/Vilt_VQA_testing_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_40_small_0.8_without_VLP_Zhiyuan-PyTorch-Test_1622519596_38814551_20epoch.yaml (in=1588) (out=695) (deflated 56%)
adding: aux_data/Jacob_config/VQA/VLP/Vilt_VQA_testing_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_40_small_0.8_without_VLP_Zhiyuan-PyTorch-Test_1622519657_c6f06096.yaml (in=1562) (out=688) (deflated 56%)
adding: aux_data/Jacob_config/VQA/VLP/Vilt_VQA_testing_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_40_small_0.8_without_VLP_Zhiyuan-PyTorch-Test_1622556377_1e410b29.yaml (in=1548) (out=673) (deflated 57%)
adding: aux_data/Jacob_config/VQA/Vilt_VQA_testing_batch-size_1024_encoder_vit_base_patch16_384_lr_2e-4_iter_40_small_0.08_without_VLP.yaml (in=1357) (out=586) (deflated 57%)
adding: aux_data/Jacob_config/others/ (in=0) (out=0) (stored 0%)
adding: aux_data/Jacob_config/others/0ce_1logit.json (in=15792897) (out=616979) (deflated 96%)
adding: aux_data/Jacob_config/others/vilt/ (in=0) (out=0) (stored 0%)
adding: aux_data/Jacob_config/others/vilt/vqa_uni_pipeline.yaml (in=1713) (out=589) (deflated 66%)
adding: aux_data/Jacob_config/others/vilt/vlp_vilt.yaml (in=2523) (out=818) (deflated 68%)
adding: aux_data/Jacob_config/others/vilt/ignore_pattern.yaml (in=75) (out=49) (deflated 35%)
adding: aux_data/Jacob_config/others/vilt/waiting_then_vilt_vqa.yaml (in=3238) (out=942) (deflated 71%)
adding: aux_data/Jacob_config/others/vilt/caption_uni_pipeline.yaml (in=1494) (out=467) (deflated 69%)
adding: aux_data/Jacob_config/others/vilt/vilt_vqa_uni_pipeline.yaml (in=2218) (out=713) (deflated 68%)
adding: aux_data/Jacob_config/others/vilt/waiting_then_vilt_caption.yaml (in=2531) (out=751) (deflated 70%)
adding: aux_data/Jacob_config/others/vilt/caption_uni_pipeline_debug.yaml (in=1491) (out=465) (deflated 69%)
adding: aux_data/Jacob_config/others/caption_uni_pipeline_teacher.yaml (in=1174) (out=509) (deflated 57%)
adding: aux_data/Jacob_config/others/VG-SGG-dicts-vgoi6-clipped.json (in=108739) (out=122) (deflated 100%)
adding: aux_data/Jacob_config/others/my.png (in=254733) (out=254741) (deflated 0%)
adding: aux_data/Jacob_config/kim_vilt/ (in=0) (out=0) (stored 0%)
adding: aux_data/Jacob_config/kim_vilt/Kim_Vilt_CC_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP.yaml (in=1351) (out=599) (deflated 56%)
adding: aux_data/Jacob_config/kim_vilt/Kim_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP.yaml (in=1350) (out=592) (deflated 56%)
adding: aux_data/Jacob_config/kim_vilt/Kim_Vilt_captioning_testing_batch-size_1024_encoder_vit_base_patch32_384_lr_1e-5_iter_30_with_VLP.yaml (in=1353) (out=589) (deflated 56%)
adding: aux_data/Jacob_config/kim_vilt/Kim_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_2e-4_iter_30_with_VLP.yaml (in=1350) (out=592) (deflated 56%)
adding: aux_data/Jacob_config/kim_vilt/Kim_Vilt_captioning_testing_batch-size_1024_encoder_vit_base_patch32_384_lr_5e-6_iter_30_with_VLP.yaml (in=1353) (out=593) (deflated 56%)
adding: aux_data/Jacob_config/kim_vilt/Kim_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_5_with_VLP_SCST.yaml (in=1516) (out=630) (deflated 58%)
adding: aux_data/Jacob_config/kim_vilt/Kim_Vilt_captioning_testing_batch-size_1024_encoder_vit_base_patch32_384_lr_5e-5_iter_30_with_VLP.yaml (in=1353) (out=592) (deflated 56%)
adding: aux_data/Jacob_config/kim_vilt/vqa/ (in=0) (out=0) (stored 0%)
adding: aux_data/Jacob_config/kim_vilt/vqa/Kim_Vilt_vqa_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP_eval.yaml (in=1305) (out=587) (deflated 55%)
adding: aux_data/Jacob_config/kim_vilt/vqa/Kim_Vilt_vqa_testing_batch-size_256_encoder_vit_base_patch32_384_lr_5e-5_iter_10_with_VLP.yaml (in=1352) (out=595) (deflated 56%)
adding: aux_data/Jacob_config/kim_vilt/vqa/Kim_Vilt_vqa_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP_kim_classifier_.yaml (in=1397) (out=607) (deflated 57%)
adding: aux_data/Jacob_config/kim_vilt/vqa/Kim_Vilt_vqa_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP.yaml (in=1353) (out=598) (deflated 56%)
adding: aux_data/Jacob_config/kim_vilt/vqa/Kim_Vilt_vqa_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP_kim_classifier_dropout_0.1.yaml (in=1397) (out=606) (deflated 57%)
adding: aux_data/Jacob_config/kim_vilt/vqa/Kim_Vilt_vqa_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP.yaml (in=1353) (out=597) (deflated 56%)
adding: aux_data/Jacob_config/kim_vilt/vqa/Kim_Vilt_vqa_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_60_with_VLP.yaml (in=1353) (out=598) (deflated 56%)
adding: aux_data/Jacob_config/kim_vilt/vqa/Kim_Vilt_vqa_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP_eval2.yaml (in=1326) (out=592) (deflated 55%)
adding: aux_data/Jacob_config/kim_vilt/vqa/Kim_Vilt_vqa_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP_kim_classifier_dropout_0.3.yaml (in=1381) (out=603) (deflated 56%)
adding: aux_data/Jacob_config/kim_vilt/vqa/Kim_Vilt_vqa_testing_batch-size_256_encoder_vit_base_patch32_384_lr_5e-5_iter_30_with_VLP.yaml (in=1352) (out=596) (deflated 56%)
adding: aux_data/Jacob_config/kim_vilt/vqa/Kim_Vilt_vqa_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP_kim_classifier.yaml (in=1373) (out=600) (deflated 56%)
adding: aux_data/Jacob_config/ViTCAP/ (in=0) (out=0) (stored 0%)
adding: aux_data/Jacob_config/ViTCAP/Jacob_VLP-ViTCAP_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_120_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_nltk_singlenode_test.yaml (in=2055) (out=866) (deflated 58%)
adding: aux_data/Jacob_config/ViTCAP/Jacob_VLP-ViTCAP_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_120_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_nltk-4nodetest.yaml (in=1897) (out=813) (deflated 57%)
adding: aux_data/Jacob_config/ViTCAP/Jacob_VLP-ViTCAP_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_120_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_nltk.yaml (in=2036) (out=860) (deflated 58%)
adding: aux_data/Jacob_config/ViTCAP/Jacob_VLP-ViTCAP_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_120_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_nltk-singlenodetest.yaml (in=1896) (out=813) (deflated 57%)
adding: aux_data/Jacob_config/ViTCAP/Jacob_VLP-ViTCAP_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_120_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_nltk_nodistill.yaml (in=2051) (out=860) (deflated 58%)
adding: aux_data/Jacob_config/VLP_distill/ (in=0) (out=0) (stored 0%)
adding: aux_data/Jacob_config/VLP_distill/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-5_small_0.9_without_VLP_Teacher_eff0fpeter_1ce_1logit_0hid_batch_size_4096_iter7_resume.yaml (in=2054) (out=820) (deflated 60%)
adding: aux_data/Jacob_config/VLP_distill/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_iter_20_small_0.9_without_VLP_Teacher_eff0fpeter_1ce_1logit_0hid_batch_size_4096_iter20.yaml (in=1861) (out=785) (deflated 58%)
adding: aux_data/Jacob_config/VLP_distill/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_iter_20_small_0.9_without_VLP_Teacher_eff0fpeter_1ce_0logit_0hid_batch_size_512_iter20.yaml (in=1858) (out=783) (deflated 58%)
adding: aux_data/Jacob_config/VLP_distill/vinvl_distill/ (in=0) (out=0) (stored 0%)
adding: aux_data/Jacob_config/VLP_distill/vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_10hid_batch_size_4096_iter20_FCDistill_AllHidden.yaml (in=1773) (out=727) (deflated 59%)
adding: aux_data/Jacob_config/VLP_distill/vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_10hid_batch_size_4096_iter40_FCDistill_AllHidden.yaml (in=1771) (out=726) (deflated 59%)
adding: aux_data/Jacob_config/VLP_distill/vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_batch_size_2048_iter10.yaml (in=1701) (out=701) (deflated 59%)
adding: aux_data/Jacob_config/VLP_distill/vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_10hid_batch_size_4096_iter10_FCDistill_AllHidden.yaml (in=1771) (out=727) (deflated 59%)
adding: aux_data/Jacob_config/VLP_distill/vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_10hid_batch_size_1024_iter20_FCDistill_AllHidden_debug.yaml (in=1798) (out=732) (deflated 59%)
adding: aux_data/Jacob_config/VLP_distill/vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_10hid_batch_size_4096_iter10.yaml (in=1731) (out=713) (deflated 59%)
adding: aux_data/Jacob_config/VLP_distill/vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_4096_iter10.yaml (in=1728) (out=714) (deflated 59%)
adding: aux_data/Jacob_config/VLP_distill/vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_10hid_batch_size_4096_iter10_FCDistill.yaml (in=1755) (out=725) (deflated 59%)
adding: aux_data/Jacob_config/VLP_distill/vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_batch_size_4096_iter10.yaml (in=1701) (out=702) (deflated 59%)
adding: aux_data/Jacob_config/VLP_distill/vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_10hid_batch_size_4096_iter100_FCDistill_AllHidden.yaml (in=1774) (out=728) (deflated 59%)
adding: aux_data/Jacob_config/VLP_distill/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_iter_20_small_0.9_without_VLP_Teacher_eff0fpeter_1ce_0logit_0hid_batch_size_128_iter20.yaml (in=1820) (out=781) (deflated 57%)
adding: aux_data/Jacob_config/VLP_distill/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_iter_20_small_0.9_without_VLP_Teacher_eff0fpeter_1ce_1logit_0hid_batch_size_1024_iter20.yaml (in=1861) (out=783) (deflated 58%)
adding: aux_data/Jacob_config/VLP_distill/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_iter_20_small_0.9_without_VLP_Teacher_eff0fpeter_1ce_1logit_0hid_batch_size_512_iter20.yaml (in=1858) (out=783) (deflated 58%)
adding: aux_data/Jacob_config/VLP_distill/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_iter_20_small_0.9_without_VLP_Teacher_eff0fpeter_1ce_1logit_1hid_batch_size_4096_iter20.yaml (in=1861) (out=783) (deflated 58%)
adding: aux_data/Jacob_config/VLP_distill/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_iter_20_small_0.9_without_VLP_Teacher_eff0fpeter_1ce_0logit_0hid_batch_size_2048_iter20.yaml (in=1861) (out=782) (deflated 58%)
adding: aux_data/Jacob_config/VLP_distill/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_iter_20_small_0.9_without_VLP_Teacher_eff0fpeter_1ce_1logit_0hid_batch_size_2048_iter20.yaml (in=1861) (out=782) (deflated 58%)
adding: aux_data/Jacob_config/VLP_distill/s2s_bid_vinvl_distill/ (in=0) (out=0)
(stored 0%) adding: aux_data/Jacob_config/VLP_distill/s2s_bid_vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_4096_iter10_s2s.yaml (in=1792) (out=739) (deflated 59%) adding: aux_data/Jacob_config/VLP_distill/s2s_bid_vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_1hid_batch_size_4096_iter10_s2s.yaml (in=1785) (out=736) (deflated 59%) adding: aux_data/Jacob_config/VLP_distill/s2s_bid_vinvl_distill/priori/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/VLP_distill/s2s_bid_vinvl_distill/priori/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_10hid_batch_size_4096_iter10_s2s_FC_all_token.yaml (in=1842) (out=747) (deflated 59%) adding: aux_data/Jacob_config/VLP_distill/s2s_bid_vinvl_distill/priori/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_2048_iter10_s2s_finetuned.yaml (in=1963) (out=822) (deflated 58%) adding: aux_data/Jacob_config/VLP_distill/s2s_bid_vinvl_distill/priori/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_512_iter100_s2s_multi_scale.yaml (in=1888) (out=763) (deflated 60%) adding: aux_data/Jacob_config/VLP_distill/s2s_bid_vinvl_distill/priori/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_4096_iter10_s2s_FC.yaml (in=1825) (out=740) (deflated 59%) adding: aux_data/Jacob_config/VLP_distill/s2s_bid_vinvl_distill/priori/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_10hid_batch_size_4096_iter10_s2s_FC.yaml (in=1821) (out=739) (deflated 59%) adding: aux_data/Jacob_config/VLP_distill/s2s_bid_vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_1_logit_10hid_batch_size_4096_iter10_s2s_FC.yaml (in=1807) (out=751) (deflated 58%) adding: aux_data/Jacob_config/VLP_distill/s2s_bid_vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_0ce_10logit_batch_size_4096_iter10_s2s.yaml (in=1792) (out=738) (deflated 59%) adding: aux_data/Jacob_config/VLP_distill/s2s_bid_vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_1_logit_1hid_batch_size_4096_iter10_s2s.yaml (in=1798) (out=738) (deflated 59%) adding: aux_data/Jacob_config/VLP_distill/s2s_bid_vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_10hid_batch_size_4096_iter10_s2s_FC.yaml (in=1793) (out=740) (deflated 59%) adding: aux_data/Jacob_config/VLP_distill/s2s_bid_vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_1_logit_batch_size_4096_iter10_s2s.yaml (in=1789) (out=737) (deflated 59%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_1logit_10hid_batch_size_4096_iter10_s2s.yaml (in=1759) (out=725) (deflated 59%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_4096_iter10_s2s.yaml 
(in=1747) (out=724) (deflated 59%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_1_logit_10_hid_batch_size_2048_iter10_s2s.yaml (in=1756) (out=723) (deflated 59%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_1_logit_batch_size_2048_iter40_s2s.yaml (in=1744) (out=721) (deflated 59%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_1logit_10hid_batch_size_4096_iter10_s2s_FC.yaml (in=1764) (out=728) (deflated 59%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_batch_size_2048_iter10_s2s.yaml (in=1732) (out=716) (deflated 59%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/priori/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/priori/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_256_iter10_s2s_multi_scale_multi_crop_sample_0.9.yaml (in=1991) (out=774) (deflated 61%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/priori/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_256_iter10_s2s_multi_scale.yaml (in=1845) (out=739) (deflated 60%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/priori/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_256_iter10_s2s_multi_scale.yaml (in=1845) (out=740) (deflated 60%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/priori/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_0logit_10hid_batch_size_4096_iter10_s2s_FC_alltoken.yaml (in=1838) (out=751) (deflated 59%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/priori/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_512_iter100_s2s_multi_scale.yaml (in=1848) (out=739) (deflated 60%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/priori/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_256_iter100_s2s_multi_scale_crop.yaml (in=1950) (out=762) (deflated 61%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/priori/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_256_iter10_s2s_multi_scale_multi_crop_sample_0.6.yaml (in=1991) (out=776) (deflated 61%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/priori/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_256_iter10_s2s_multi_scale_multi_crop.yaml (in=1888) (out=749) (deflated 60%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/priori/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_256_iter10_s2s_multi_scale_crop_4_8.yaml (in=1947) (out=761) (deflated 61%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/priori/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_0logit_10hid_batch_size_4096_iter10_s2s_FC.yaml (in=1817) 
(out=742) (deflated 59%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/priori/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_256_iter10_s2s_multi_scale_multi_crop_sample_0.8.yaml (in=1991) (out=776) (deflated 61%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/priori/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_256_iter10_s2s_multi_scale_crop_2_4.yaml (in=1947) (out=761) (deflated 61%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/priori/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_4096_iter10_s2s_FC.yaml (in=1805) (out=739) (deflated 59%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/priori/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_1024_iter100_s2s_multi_scale.yaml (in=1851) (out=740) (deflated 60%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/gumbel/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/gumbel/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_256_iter10_s2s_multi_scale_multi_crop_gumbel_sample_0.8_10logit_continue_2epoch.yaml (in=2297) (out=850) (deflated 63%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/gumbel/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_256_iter10_s2s_multi_scale_multi_crop_gumbel_sample_0.9.yaml (in=2012) (out=783) (deflated 61%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/gumbel/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_256_iter10_s2s_multi_scale_multi_crop_gumbel_sample_0.9_10logit.yaml (in=2028) (out=787) (deflated 61%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/gumbel/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_256_iter10_s2s_multi_scale_multi_crop_gumbel_sample_0.8.yaml (in=2019) (out=788) (deflated 61%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/gumbel/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_256_iter10_s2s_multi_scale_multi_crop_gumbel_sample_0.8_10logit.yaml (in=2035) (out=792) (deflated 61%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_small_0.08_VinVL_Teacher_vinvl_1ce_10logit_batch_size_512_iter80_s2s_multi_scale_crop_1_2_4.yaml (in=1924) (out=759) (deflated 61%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_512_iter80_s2s_multi_scale_crop_1_2_4.yaml (in=1921) (out=758) (deflated 61%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_512_iter40_s2s_16_384_small_0.08.yaml (in=1856) (out=746) (deflated 60%) adding: 
aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/Jacob_VLP_Distill_ENC_DEC_vit_base_patch16_384_lr_1e-4_small_0.08_VinVL_Teacher_vinvl_1ce_10logit_batch_size_1024_iter80_s2s.yaml (in=1963) (out=787) (deflated 60%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/Jacob_vinvl_large_4M_seq2seq_uni_batch-size_1024_lr_1e-4_iter_80.yaml (in=1170) (out=541) (deflated 54%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/Jacob_VLP_Distill_ENC_DEC_vit_base_patch16_384_lr_1e-4_small_0.08_VinVL_Teacher_vinvl_1ce_10logit_batch_size_512_iter80_s2s.yaml (in=1960) (out=788) (deflated 60%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_512_iter80_s2s_multi_scale_crop_1_1_1.yaml (in=1888) (out=754) (deflated 60%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_512_iter10_s2s_multi_scale_crop_1_2_4.yaml (in=1922) (out=757) (deflated 61%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_small_0.08_VinVL_Teacher_vinvl_1ce_10logit_batch_size_512_iter80_s2s_multi_scale_crop_1_2_4_resume.yaml (in=2161) (out=818) (deflated 62%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_small_0.08_VinVL_Teacher_vinvl_1ce_10logit_batch_size_512_iter10_s2s_multi_scale_crop_1_2_4_s2s_teacher.yaml (in=2048) (out=812) (deflated 60%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_512_iter40_s2s_multi_scale_crop_1_2_4.yaml (in=1899) (out=754) (deflated 60%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/ViTCAP/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/ViTCAP/Jacob_VLP-ViTCAP_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_120_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_nltk_singlenode_test.yaml (in=2055) (out=866) (deflated 58%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/ViTCAP/Jacob_VLP-ViTCAP_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_120_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_nltk-4nodetest.yaml (in=1897) (out=813) (deflated 57%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/ViTCAP/Jacob_VLP-ViTCAP_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_120_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_nltk.yaml (in=2024) (out=856) (deflated 58%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/ViTCAP/Jacob_VLP-ViTCAP_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_120_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_nltk-singlenodetest.yaml (in=1896) (out=813) (deflated 57%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_512_iter10_s2s_multi_scale_crop_1_2_4_small_0.08.yaml (in=1902) (out=753) (deflated 60%) adding: 
aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/encdec_with_tags/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/encdec_with_tags/Jacob_VLP_Distill_ENC_DEC_vit_base_patch16_384_lr_1e-4_small_0.08_VinVL_Teacher_vinvl_1ce_10logit_batch_size_1024_iter80_s2s.yaml (in=2031) (out=803) (deflated 60%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/encdec_with_tags/Jacob_VLP_Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-5_iter_10_vinvl_tags_ENC-DEC_Split_4_multiplier_0.1_32_gpu_cls_embedding_tagger_from_scratch_gen-ratio_6e-1.yaml (in=2008) (out=856) (deflated 57%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/encdec_with_tags/Jacob_VLP_Logit_Vilt_captioning_testing_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-5_iter_80_vinvl_tags_ENC-DEC_Split_4_multiplier_0.1_32_gpu_cls_embedding.yaml (in=1919) (out=819) (deflated 57%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/encdec_with_tags/Jacob_VLP_Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-5_iter_80_vinvl_tags_ENC-DEC_Split_4_multiplier_0.1_32_gpu.yaml (in=1869) (out=806) (deflated 57%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/encdec_with_tags/Jacob_VLP_Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-5_iter_10_vinvl_tags_ENC-DEC_Split_4_multiplier_0.1_32_gpu_cls_embedding_tagger_from_scratch_gen-ratio_0e-1.yaml (in=2008) (out=855) (deflated 57%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/encdec_with_tags/Jacob_VLP_Distill_ENC_DEC_vit_base_patch16_384_lr_1e-4_small_0.08_VinVL_Teacher_vinvl_1ce_10logit_batch_size_1024_iter80_s2s_init_vitbfocal40.yaml (in=2197) (out=887) (deflated 60%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/encdec_with_tags/Jacob_VLP_Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-5_iter_80_vinvl_tags_ENC-DEC_Split_4_multiplier_0.1.yaml (in=1809) (out=770) (deflated 57%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/encdec_with_tags/Jacob_VLP_Logit_Vilt_captioning_testing_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-5_iter_80_vinvl_tags_ENC-DEC_Split_4_multiplier_0.1_32_gpu.yaml (in=1897) (out=814) (deflated 57%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/encdec_with_tags/Jacob_VLP_Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-5_iter_30_vitbfocal10_tags_ENC-DEC_multiplier_0.1.yaml (in=1770) (out=754) (deflated 57%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/encdec_with_tags/Jacob_VLP_Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-5_iter_10_vinvl_tags_ENC-DEC_Split_4_multiplier_0.1_32_gpu_cls_embedding_tagger_from_scratch_gen-ratio_1e-1.yaml (in=2008) (out=855) (deflated 57%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/encdec_with_tags/Jacob_VLP_Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-5_iter_10_vinvl_tags_ENC-DEC_Split_4_multiplier_0.1_32_gpu_cls_embedding_tagger_from_scratch_gen-ratio_2e-1.yaml (in=2008) (out=853) (deflated 58%) adding: 
aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/encdec_with_tags/Jacob_VLP_Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-5_iter_10_vinvl_tags_ENC-DEC_Split_4_multiplier_0.1_32_gpu_cls_embedding_tagger_from_scratch_gen-ratio_3e-1.yaml (in=2008) (out=856) (deflated 57%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/encdec_with_tags/Jacob_VLP_Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-5_iter_10_vinvl_tags_ENC-DEC_Split_4_multiplier_0.1_32_gpu_cls_embedding_tagger_from_scratch_gen-ratio_8e-1.yaml (in=2008) (out=855) (deflated 57%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/encdec_with_tags/Jacob_VLP_Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-5_iter_80_vinvl_tags_ENC-DEC_Split_8_multiplier_0.1.yaml (in=1809) (out=769) (deflated 57%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/encdec_with_tags/Jacob_VLP_Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-5_iter_10_vinvl_tags_ENC-DEC_Split_4_multiplier_0.1_32_gpu_cls_embedding_tagger_from_scratch_gen-ratio_1e-1_test.yaml (in=2007) (out=852) (deflated 58%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/encdec_with_tags/Jacob_VLP_Logit_Vilt_captioning_testing_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-5_iter_80_vinvl_tags_ENC-DEC_Split_8_multiplier_0.1_32_gpu.yaml (in=1813) (out=766) (deflated 58%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/encdec_with_tags/Jacob_VLP_Distill_ENC_DEC_vit_base_patch16_384_lr_1e-4_small_0.08_VinVL_Teacher_vinvl_1ce_10logit_batch_size_1024_iter80_s2s_init_vitbfocal40_tags_vvitbfocal40crop008.yaml (in=2262) (out=903) (deflated 60%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_0logit_10hid_batch_size_4096_iter10_s2s_NOFC.yaml (in=1769) (out=734) (deflated 59%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/diff_topk/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/diff_topk/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_256_iter10_s2s_multi_scale_multi_crop_gumbel_sample_0.9_10logit.yaml (in=2022) (out=787) (deflated 61%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/diff_topk/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_256_iter10_s2s_multi_scale_multi_crop_gumbel_sample_0.8_10logit.yaml (in=2022) (out=790) (deflated 61%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_1_logit_batch_size_4096_iter80_s2s.yaml (in=1744) (out=722) (deflated 59%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_0logit_10hid_batch_size_4096_iter10_s2s_FC.yaml (in=1764) (out=729) (deflated 59%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_1_logit_batch_size_2048_iter10_s2s.yaml (in=1744) (out=720) (deflated 59%) adding: 
aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_1_logit_batch_size_4096_iter10_s2s.yaml (in=1748) (out=722) (deflated 59%) adding: aux_data/Jacob_config/VLP_distill/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_iter_20_small_0.9_without_VLP_Teacher_eff0fpeter_1ce_0logit_0hid_batch_size_4096_iter20.yaml (in=1861) (out=784) (deflated 58%) adding: aux_data/Jacob_config/VLP_distill/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_iter_20_small_0.9_without_VLP_Teacher_eff0fpeter_1ce_0logit_0hid_batch_size_1024_iter20.yaml (in=1861) (out=783) (deflated 58%) adding: aux_data/Jacob_config/VLP_distill/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-5_small_0.9_without_VLP_Teacher_eff0fpeter_1ce_1logit_0hid_batch_size_4096_iter7_resume_.yaml (in=2054) (out=817) (deflated 60%) adding: aux_data/Jacob_config/VLP/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/VLP/TaxCCSBUCocoVGCapSplit_TEST_512.yaml (in=1202) (out=555) (deflated 54%) adding: aux_data/Jacob_config/VLP/multi_scale/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/VLP/multi_scale/TaxCCSBUCocoVGCapSplit_MultiScale_VLP_Vilt-base_VLPS_BS1024_MaxIter30e_LR0.0001_Warm3e.yaml (in=1431) (out=632) (deflated 56%) adding: aux_data/Jacob_config/VLP/multi_scale/others/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/VLP/multi_scale/others/TaxCCSBUCocoVGCapSplit_MultiScale_VLP_Vilt-base_VLPS_BS512_MaxIter30e_LR0.0002_Warm3e_tokensample_378.yaml (in=1440) (out=634) (deflated 56%) adding: aux_data/Jacob_config/VLP/multi_scale/others/TaxCCSBUCocoVGCapSplit_MultiScale_VLP_Vilt-base_VLPS_BS1024_MaxIter30e_LR0.0001_Warm3e_tokensample_378.yaml (in=1458) (out=641) (deflated 56%) adding: aux_data/Jacob_config/VLP/multi_scale/others/TaxCCSBUCocoVGCapSplit_MultiScale_VLP_Vilt-base_VLPS_BS1024_MaxIter30e_LR0.0001_Warm3e.yaml (in=1402) (out=616) (deflated 56%) adding: aux_data/Jacob_config/VLP/multi_scale/others/TaxCCSBUCocoVGCapSplit_MultiScale_VLP_Vilt-base_VLPS_BS1024_MaxIter30e_LR0.0002_Warm3e_tokensample_378.yaml (in=1458) (out=643) (deflated 56%) adding: aux_data/Jacob_config/VLP/TaxCOCOCaption_TEST_512.yaml (in=1169) (out=546) (deflated 53%) adding: aux_data/Jacob_config/VLP/Jacob_Vilt_VLP_TaxCCSBUCocoVGCap_iter_80_lr_1e-4.yaml (in=1206) (out=558) (deflated 54%) adding: aux_data/Jacob_config/VLP/others/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/VLP/others/TaxCCSBUCocoGQAFlk30VqaVGqaSplit_VLP_Vilt-base_VLPS_BS4096_MaxIter40e_LR0.0001_Warm3e_dataaug_384transform.yaml (in=894) (out=456) (deflated 49%) adding: aux_data/Jacob_config/VLP/others/TaxCCSBUCocoGQAFlk30VqaVGqaSplit_VLP_Vilt-base_VLPS_BS4096_MaxIter40e_LR0.0004_Warm3e.yaml (in=838) (out=438) (deflated 48%) adding: aux_data/Jacob_config/VLP/others/TaxCCSBUCocoGQAFlk30VqaVGqaSplit_VLP_MiniVLM_VLPS_BS4096_MaxIter10e_LR0.00056_Warm1e_Feff0f_Leff0f_base12.yaml (in=967) (out=476) (deflated 51%) adding: aux_data/Jacob_config/VLP/others/COCO_VLP_MiniVLM_VLPS_BS2048_MaxIter100e_LR0.0004_Warm5e_Feff0f_Leff0f_base12.yaml (in=848) (out=439) (deflated 48%) adding: aux_data/Jacob_config/VLP/others/TaxCCSBUCocoGQAFlk30VqaVGqaSplit_VLP_Vilt-base_VLPS_BS4096_MaxIter40e_LR0.0001_Warm3e_dataaug.yaml (in=878) (out=449) (deflated 49%) adding: aux_data/Jacob_config/VLP/others/TaxCCSBUCocoGQAFlk30VqaVGqaSplit_VLP_MiniVLM_VLPS_BS2048_MaxIter100e_LR0.0004_Warm5e_Feff0f_Leff3f_base12.yaml (in=941) (out=485) (deflated 48%) adding: 
aux_data/Jacob_config/VLP/others/TaxCCSBUCocoGQAFlk30VqaVGqaSplit_VLP_MiniVLM_VLPS_BS2048_MaxIter5e_LR0.0004_Warm1e_Feff0f_Leff0f_base12.yaml (in=961) (out=474) (deflated 51%) adding: aux_data/Jacob_config/VLP/others/TaxCCSBUCocoGQAFlk30VqaVGqaSplit_VLP_MiniVLM_VLPS_BS2048_MaxIter100e_LR0.0004_Warm5e_Feff0f_Leff0f_base12.yaml (in=941) (out=482) (deflated 49%) adding: aux_data/Jacob_config/VLP/others/TaxCCSBUCocoGQAFlk30VqaVGqaSplit_VLP_Vilt-base_VLPS_BS4096_MaxIter40e_LR0.0001_Warm3e_dataaug_384transform_exp4.yaml (in=904) (out=459) (deflated 49%) adding: aux_data/Jacob_config/VLP/others/TaxCCSBUCocoGQAFlk30VqaVGqaSplit_VLP_Vilt-base_VLPS_BS4096_MaxIter40e_LR0.0001_Warm3e.yaml (in=840) (out=439) (deflated 48%) adding: aux_data/Jacob_config/VLP/others/TaxCCSBUCocoGQAFlk30VqaVGqaSplit_VLP_MiniVLM_VLPS_BS4096_MaxIter30e_LR0.00056_Warm0e_Feff0f_Leff0f_base12_908.yaml (in=1127) (out=533) (deflated 53%) adding: aux_data/Jacob_config/VLP/others/CC_VLP_MiniVLM_VLPS_BS2048_MaxIter100e_LR0.0004_Warm5e_Feff0f_Leff0f_base12.yaml (in=882) (out=453) (deflated 49%) adding: aux_data/Jacob_config/VLP/others/TaxCCSBUCocoGQAFlk30VqaVGqaSplit_VLP_MiniVLM_VLPS_BS4096_MaxIter1e_LR0.00056_Warm0e_Feff0f_Leff0f_base12.yaml (in=964) (out=475) (deflated 51%) adding: aux_data/Jacob_config/VLP/others/COCO_VLP_MiniVLM_VLPS_BS2048_MaxIter100e_LR0.00008_Warm5e_Feff0f_Leff0f_base12_vlp765.yaml (in=955) (out=466) (deflated 51%) adding: aux_data/Jacob_config/VLP/others/TaxCCSBUCocoGQAFlk30VqaVGqaSplit_VLP_Oscarbase_VLPS_BS2048_MaxIter100e_LR0.0002_Warm5e_Feff0fpeter_Leff0fpeter_base12.yaml (in=999) (out=476) (deflated 52%) adding: aux_data/Jacob_config/VLP/others/TaxCCSBUCocoGQAFlk30VqaVGqaSplit_VLP_MiniVLM_VLPS_BS4096_MaxIter50e_LR0.00056_Warm5e_Feff0f_Leff0f_base12.yaml (in=967) (out=478) (deflated 51%) adding: aux_data/Jacob_config/VLP/others/TaxCCSBUCocoGQAFlk30VqaVGqaSplit_VLP_MiniVLM_VLPS_BS4096_MaxIter20e_LR0.00056_Warm5e_Feff0f_Leff0f_base12.yaml (in=967) (out=478) (deflated 51%) adding: aux_data/Jacob_config/VLP/others/TaxCCSBUCocoGQAFlk30VqaVGqaSplit_VLP_Oscarbase_VLPS_BS2048_MaxIter100e_LR0.0004_Warm5e_Feff0fpeter_Leff0fpeter_base12.yaml (in=999) (out=475) (deflated 52%) adding: aux_data/Jacob_config/VLP/others/TaxCCSBUCocoGQAFlk30VqaVGqaSplit_VLP_Oscarbase_VLPS_BS2048_MaxIter100e_LR0.0001_Warm5e_Feff0fpeter_Leff0fpeter_base12.yaml (in=999) (out=476) (deflated 52%) adding: aux_data/Jacob_config/VLP/others/TaxCCSBUCocoGQAFlk30VqaVGqaSplit_VLP_MiniVLM_VLPS_BS4096_MaxIter50e_LR0.0008_Warm5e_Feff0f_Leff0f_base12.yaml (in=964) (out=477) (deflated 51%) adding: aux_data/Jacob_config/VLP/others/TaxCCSBUCocoGQAFlk30VqaVGqaSplit_VLP_Vilt-base_VLPS_BS4096_MaxIter40e_LR0.0001_Warm3e_dataaug_384transform_exp5.yaml (in=904) (out=459) (deflated 49%) adding: aux_data/Jacob_config/VLP/coco_bidirectional_finetune/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/VLP/coco_bidirectional_finetune/TaxCocoCaption_VLP_Vilt-base_VLPS_BS4096_MaxIter30e_LR5.0e-4_Warm3e.yaml (in=1155) (out=527) (deflated 54%) adding: aux_data/Jacob_config/VLP/TaxCCSBUCocoVGCapSplit_TEST.yaml (in=1202) (out=556) (deflated 54%) adding: aux_data/Jacob_config/CIDEr_optimize/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/CIDEr_optimize/cider_Jacob_Logit_Distill_Vilt_captioning_testing_lr_1.0e-4_iter_30_batch-size_256_encoder_vit_base_patch16_384_without_VLP_multi_scale_fix_BS_64_Sample_200.yaml (in=1595) (out=642) (deflated 60%) adding: 
aux_data/Jacob_config/CIDEr_optimize/cider_Jacob_Logit_Distill_Vilt_captioning_testing_lr_1.0e-4_iter_30_batch-size_256_encoder_vit_base_patch16_384_without_VLP_multi_scale_fix_BS_64_Sample_276.yaml (in=1596) (out=643) (deflated 60%) adding: aux_data/Jacob_config/CIDEr_optimize/JACOB_CIDER_Jacob_Logit_Distill_Vilt_captioning_testing_lr_1.0e-4_iter_30_batch-size_32_ENC_DEC_patch16_384_ENC_DEC_S2S_80_VLP_tags_vitbfocal10_num_1_epoch_5.yaml (in=1754) (out=748) (deflated 57%) adding: aux_data/Jacob_config/CIDEr_optimize/JACOB_CIDER_Jacob_Logit_Distill_Vilt_captioning_testing_lr_5.0e-6_iter_30_batch-size_32_ENC_DEC_patch16_384_ENC_DEC_S2S_80_VLP_tags_vitbfocal10_num_2_epoch_5_lr_3e-6.yaml (in=1752) (out=751) (deflated 57%) adding: aux_data/Jacob_config/CIDEr_optimize/cider_Jacob_Logit_Distill_Vilt_captioning_testing_lr_1.0e-4_iter_30_batch-size_256_encoder_vit_base_patch16_384_without_VLP_multi_scale_fix_BS_64_Sample_300.yaml (in=1596) (out=642) (deflated 60%) adding: aux_data/Jacob_config/CIDEr_optimize/JACOB_CIDER_Jacob_Logit_Distill_Vilt_captioning_testing_lr_1.0e-4_iter_30_batch-size_32_ENC_DEC_patch16_384_ENC_DEC_S2S_80_VLP_tags_vitbfocal10_sample_0.4.yaml (in=1723) (out=736) (deflated 57%) adding: aux_data/Jacob_config/CIDEr_optimize/cider_Jacob_Logit_Distill_Vilt_captioning_testing_lr_1.0e-4_iter_30_batch-size_256_encoder_vit_base_patch16_384_without_VLP_multi_scale_fix_BS_16_Sample_276.yaml (in=1596) (out=640) (deflated 60%) adding: aux_data/Jacob_config/CIDEr_optimize/JACOB_CIDER_Jacob_Logit_Distill_Vilt_captioning_testing_lr_5.0e-6_iter_30_batch-size_32_ENC_DEC_patch16_384_ENC_DEC_NO_VLP_tags_vitbfocal10_num_2_epoch_5_lr_3e-6.yaml (in=1790) (out=777) (deflated 57%) adding: aux_data/Jacob_config/CIDEr_optimize/cider_Jacob_Logit_Distill_Vilt_captioning_testing_lr_1.0e-4_iter_30_batch-size_256_encoder_vit_base_patch16_384_without_VLP_multi_scale_fix.yaml (in=1561) (out=631) (deflated 60%) adding: aux_data/Jacob_config/CIDEr_optimize/JACOB_CIDER_Jacob_Logit_Distill_Vilt_captioning_testing_lr_1.0e-4_iter_30_batch-size_32_ENC_DEC_patch16_384_ENC_DEC_S2S_80_VLP_tags_vitbfocal10_sample_0.3.yaml (in=1740) (out=743) (deflated 57%) adding: aux_data/Jacob_config/CIDEr_optimize/JACOB_CIDER_Jacob_Logit_Distill_Vilt_captioning_testing_lr_1.0e-4_iter_150_batch-size_32_ENC_DEC_patch16_384_ENC_DEC_S2S_80_VLP_tags_vitbfocal10_num_1_epoch_25.yaml (in=1778) (out=758) (deflated 57%) adding: aux_data/Jacob_config/new_result/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/new_result/coco-caption_no_vlp/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/new_result/coco-caption_no_vlp/caption_Embedding/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/new_result/coco-caption_no_vlp/caption_Embedding/VinVL_Label_60_epoch_0.05_lrreduc/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/new_result/coco-caption_no_vlp/caption_Embedding/VinVL_Label_60_epoch_0.05_lrreduc/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.05_caption_emb.yaml (in=1860) (out=804) (deflated 57%) adding: aux_data/Jacob_config/new_result/coco-caption_no_vlp/caption_Embedding/VinVL_Label_60_epoch_0.05_lrreduc/70_driver_log_0 (27).txt (in=3741815) (out=463374) (deflated 88%) adding: aux_data/Jacob_config/new_result/coco-caption_no_vlp/New_learnable_embedding_layer/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/new_result/coco-caption_no_vlp/New_learnable_embedding_layer/VinVL_Label_60_epoch_0.1_lrreduc/ 
(in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/new_result/coco-caption_no_vlp/New_learnable_embedding_layer/VinVL_Label_60_epoch_0.1_lrreduc/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1.yaml (in=1828) (out=797) (deflated 56%) adding: aux_data/Jacob_config/new_result/coco-caption_no_vlp/New_learnable_embedding_layer/VinVL_Label_60_epoch_0.1_lrreduc/70_driver_log_0 (27).txt (in=3834561) (out=472577) (deflated 88%) adding: aux_data/Jacob_config/new_result/coco-caption_no_vlp/New_learnable_embedding_layer/VinL_Label_30_epoch_0.05_lrreduc/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/new_result/coco-caption_no_vlp/New_learnable_embedding_layer/VinL_Label_30_epoch_0.05_lrreduc/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.05_CIDEr.yaml (in=1831) (out=797) (deflated 56%) adding: aux_data/Jacob_config/new_result/coco-caption_no_vlp/New_learnable_embedding_layer/VinL_Label_30_epoch_0.05_lrreduc/70_driver_log_0 (27).txt (in=2519172) (out=300760) (deflated 88%) adding: aux_data/Jacob_config/new_result/coco-caption_no_vlp/New_learnable_embedding_layer/VinVL_Label_60_epoch_0.01_lrreduc/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/new_result/coco-caption_no_vlp/New_learnable_embedding_layer/VinVL_Label_60_epoch_0.01_lrreduc/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.01.yaml (in=1831) (out=797) (deflated 56%) adding: aux_data/Jacob_config/new_result/coco-caption_no_vlp/New_learnable_embedding_layer/VinVL_Label_60_epoch_0.01_lrreduc/70_driver_log_0 (27).txt (in=3741815) (out=463374) (deflated 88%) adding: aux_data/Jacob_config/new_result/coco-caption_no_vlp/New_learnable_embedding_layer/VinVL_Label_30_epoch_0.1_lrreduc/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/new_result/coco-caption_no_vlp/New_learnable_embedding_layer/VinVL_Label_30_epoch_0.1_lrreduc/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_CIDEr_119.4.yaml (in=1828) (out=796) (deflated 56%) adding: aux_data/Jacob_config/new_result/coco-caption_no_vlp/classifier_Embedding/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/new_result/coco-caption_no_vlp/classifier_Embedding/VinVL_Label_60_epoch_0.1_lrreduc/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/new_result/coco-caption_no_vlp/classifier_Embedding/VinVL_Label_60_epoch_0.1_lrreduc/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1.yaml (in=1870) (out=811) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/coco_Distill/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_60_without_VLP.yaml (in=1544) (out=677) (deflated 56%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/Logit_Distill_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_without_VLP_multi_scale_0ce_1logit.yaml (in=1704) (out=720) (deflated 58%) adding: 
aux_data/Jacob_config/coco_Distill/multi_scale/Logit_Distill_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_without_VLP_multi_scale_0ce_1logit.yaml (in=1704) (out=721) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_without_VLP_multi_scale_tower_test.yaml (in=2057) (out=737) (deflated 64%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_Zhiyuan-PyTorch-Test_1626887552_3db73349_correct+Distill.yaml (in=2017) (out=779) (deflated 61%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_without_VLP_multi_scale_4_8.yaml (in=1809) (out=712) (deflated 61%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_tags_vitb_ENC-DEC_len70_conf_0.3.yaml (in=1507) (out=628) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale.yaml (in=2043) (out=821) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_without_VLP_multi_scale_random_0.2_inference_sample.yaml (in=1856) (out=731) (deflated 61%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_Zhiyuan-PyTorch-Test_1626887552_3db73349.yaml (in=2143) (out=857) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_uni_vlp_distill.yaml (in=1862) (out=742) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_patch_select_0.9.yaml (in=1787) (out=728) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_128_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_kmeans_select_0.6.yaml (in=1830) (out=746) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_without_VLP_multi_scale_tower.yaml (in=2065) (out=739) (deflated 64%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_no_tags.yaml (in=1786) (out=715) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_topk_attn_token_select_0.6_layer_9_infer_sample.yaml (in=2072) (out=799) (deflated 61%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_random_select_0.6_test_sample.yaml (in=1920) (out=766) (deflated 
60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_60_with_VLP_multi_scale_ENC-DEC.yaml (in=1835) (out=728) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_without_VLP_multi_scale_ClipViT.yaml (in=1900) (out=754) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_CC_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_uni_vlp_distill_test.yaml (in=1943) (out=750) (deflated 61%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_multi_scale_token_sample_0.9_patch_select_0.8.yaml (in=1807) (out=732) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP_multi_scale_token_sample_0.9_patch_select_identity.yaml (in=1766) (out=709) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_uni_vlp_distill_test.yaml (in=2014) (out=756) (deflated 62%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_tags_eff0f.yaml (in=1789) (out=716) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_60_with_VLP_multi_scale_ENC-ENC.yaml (in=1819) (out=722) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_30__vitbfocal40_vinvl_tags.yaml (in=1757) (out=762) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_64_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_kmeans_select_0.6.yaml (in=1963) (out=751) (deflated 62%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_topk_token_select_0.6.yaml (in=1961) (out=775) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_1024_encoder_vit_base_patch32_384_lr_1e-4_iter_30_tags_vlinvits_ENC-DEC.yaml (in=1617) (out=663) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Distill_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_20_with_VLP_multi_scale_0.9.yaml (in=1774) (out=731) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_without_VLP_multi_scale_tower_seperate_cls_test.yaml (in=2083) (out=745) (deflated 64%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_32_384_lr_1e-4_iter_30_without_VLP_mutual_tower.yaml (in=1918) (out=724) (deflated 62%) adding: 
aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_tags_eff2f.yaml (in=1779) (out=712) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_kmeans_select_0.9_centroid.yaml (in=1847) (out=749) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_16_encoder_vit_base_patch16_384_lr_1e-4_iter_30_without_VLP_multi_scale_4_8.yaml (in=1828) (out=720) (deflated 61%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_tags_vlinvits.yaml (in=1809) (out=723) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_80_iter_VLP_multi_scale.yaml (in=2045) (out=814) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_480_lr_1e-4_iter_30.yaml (in=1768) (out=707) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_later_concat_7.yaml (in=1927) (out=772) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_multi_scale_token_sample_0.9_patch_select_identity2.yaml (in=1866) (out=755) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_generate_tags_ENC-DEC.yaml (in=1426) (out=629) (deflated 56%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_without_VLP_multi_scale_random_0.4_inference_sample.yaml (in=1856) (out=730) (deflated 61%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_later_concat_9.yaml (in=1927) (out=772) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Distill_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_multi_scale_0.9.yaml (in=1774) (out=732) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_batch-size_256_ENC_DEC_encoder_vit_base_patch16_384_lr_5e-5_iter_30_uni_vlp_distill_80_iter_pretraining_with_vitbfocal10_tags.yaml (in=2057) (out=846) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Distill_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_multi_scale_token_sample_0.9.yaml (in=1873) (out=763) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_tags_vinvl.yaml (in=1789) (out=715) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_60_sqr_attention.yaml (in=1814) (out=718) 
(deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_no_tags_ENC-DEC.yaml (in=1829) (out=738) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_multi_scale_ENC-DEC.yaml (in=1831) (out=728) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_sqr_attention_alter.yaml (in=1847) (out=729) (deflated 61%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_multi_scale_token_sample_0.9_patch_select_0.6.yaml (in=1809) (out=734) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_multi_scale_ENC-ENC.yaml (in=1819) (out=722) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_Zhiyuan-PyTorch-Test_1626887552_3db73349_correct_eval.yaml (in=1939) (out=763) (deflated 61%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_batch-size_256_ENC_DEC_encoder_vit_base_patch16_384_lr_5e-5_iter_30_uni_vlp_distill_80_iter_pretraining_inference.yaml (in=1983) (out=822) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Distill_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_multi_scale_token_sample_0.9-.yaml (in=1889) (out=766) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_without_VLP_multi_scale.yaml (in=1823) (out=716) (deflated 61%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_30__vitbfocal10_vinvl_tags2.yaml (in=1757) (out=761) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_tags_vitb_ENC-DEC_len70.yaml (in=1489) (out=623) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_multi_scale_token_sample_0.9_patch_select_0.9.yaml (in=1807) (out=732) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_kmeans_select_0.3_centroid.yaml (in=1847) (out=749) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_kmeans_select_0.3.yaml (in=1830) (out=748) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_Zhiyuan-PyTorch-Test_1626887552_3db73349_correct+Distill.yaml (in=1998) (out=780) (deflated 61%) adding: 
aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_sqr_attention.yaml (in=1814) (out=717) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_10_without_VLP.yaml (in=1513) (out=639) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP_multi_scale_token_sample_0.9_token_select.yaml (in=1728) (out=695) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_kmeans_select_0.6_centroid.yaml (in=1847) (out=748) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_80_iter_VLP_multi_scale3.yaml (in=2046) (out=813) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_later_concat_5.yaml (in=1927) (out=772) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_batch-size_256_ENC_DEC_encoder_vit_base_patch16_384_lr_5e-5_iter_30_uni_vlp_distill_80_iter_pretraining.yaml (in=1972) (out=815) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_kmeans_select_0.9.yaml (in=1830) (out=748) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP_multi_scale_token_sample_0.9.yaml (in=1720) (out=695) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_without_VLP_multi_scale_random_0.2.yaml (in=1821) (out=723) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding-distill/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding-distill/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_bert_distill.yaml (in=2059) (out=865) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding-distill/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_bert_distill_not_all_token.yaml (in=2086) (out=872) (deflated 58%) adding: 
aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding-distill/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_bert_distill_s=8.yaml (in=2067) (out=871) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/32-384-single-tower-vinvl/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/32-384-single-tower-vinvl/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_tags_vinvl.yaml (in=1758) (out=710) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/32-384-vitbfocal40crop008/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/32-384-vitbfocal40crop008/Logit_Vilt_captioning_testing_batch-size_1024_ENC_DEC_vit_base_patch32_384_lr_5e-5_iter_30_VinVL_tags_ENC-DEC.yaml (in=1506) (out=625) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal10-caption_tagging/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal10-caption_tagging/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-5_iter_30_vitbfocal10_tags_ENC-DEC_multiplier_0.1_concat.yaml (in=1764) (out=755) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal10-caption_tagging/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-5_iter_30_vitbfocal10_tags_ENC-DEC_multiplier_0.1.yaml (in=1751) (out=749) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal10-caption_tagging/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-5_iter_30_vitbfocal10_tags_ENC-DEC_multiplier_0.1_split_8.yaml (in=1781) (out=755) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal10-caption_tagging/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_3e-5_iter_30_vitbfocal10_tags_ENC-DEC_vitbfocal40.yaml (in=1695) (out=733) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal10-caption_tagging/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-5_iter_30_vitbfocal10_tags_ENC-DEC_multiplier_0.01_inference.yaml (in=1749) (out=751) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal10-caption_tagging/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-5_iter_30_vitbfocal10_tags_ENC-DEC_10caption_loss_1tag_loss.yaml (in=1721) (out=737) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal10-caption_tagging/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-5_iter_30_vitbfocal10_tags_ENC-DEC_multiplier_0.2.yaml (in=1750) (out=748) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal10-caption_tagging/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_3e-5_iter_30_vitbfocal10_tags_ENC-DEC.yaml (in=1671) (out=728) (deflated 56%) adding: 
aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal10-caption_tagging/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-5_iter_30_vitbfocal10_tags_ENC-DEC_10caption_loss_1tag_loss.yaml (in=1721) (out=737) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal10-caption_tagging/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-5_iter_30_vitbfocal10_tags_ENC-DEC_multiplier_1.yaml (in=1745) (out=746) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal10-caption_tagging/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-5_iter_30_vitbfocal10_tags_ENC-DEC_multiplier_1_inference.yaml (in=1740) (out=747) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal10-caption_tagging/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-5_iter_30_vitbfocal10_tags_ENC-DEC_multiplier_0.1_differentiable-topk.yaml (in=1816) (out=772) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal10-caption_tagging/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-5_iter_30_vitbfocal10_tags_ENC-DEC_multiplier_0.01.yaml (in=1754) (out=749) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal10-caption_tagging/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-5_iter_30_vitbfocal10_tags_ENC-DEC_multiplier_0.1_inference.yaml (in=1744) (out=747) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal10-caption_tagging/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-5_iter_30_vitbfocal10_tags_ENC-DEC_multiplier_0.5.yaml (in=1750) (out=748) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal10-caption_tagging/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-5_iter_30_vitbfocal10_tags_ENC-DEC_multiplier_0.1_split_8_inference.yaml (in=1774) (out=753) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal10-caption_tagging/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-5_iter_30_vitbfocal10_tags_ENC-DEC_multiplier_0.1_concat_inference.yaml (in=1758) (out=753) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/32-384-eff0f/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/32-384-eff0f/Logit_Vilt_captioning_testing_batch-size_1024_encoder_vit_base_patch32_384_lr_1e-4_iter_30_tags_eff0f_ENC-DEC_len70.yaml (in=1506) (out=626) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/32-384-vitbfocal10/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/32-384-vitbfocal10/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_5e-5_iter_30_tags_vitbfocal10_ENC-DEC_len70.yaml (in=1483) (out=631) (deflated 57%) adding: 
aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/32-384-vitbfocal10/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_tags_vitbfocal10_ENC-DEC_len70.yaml (in=1489) (out=621) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/32-384-vitbfocal10/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_5e-5_iter_30_tags_vitbfocal10_ENC-DEC_len70.yaml (in=1489) (out=624) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/32-384-vitbfocal10/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_tags_vitbfocal10_ENC-DEC_len70.yaml (in=1483) (out=628) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/32-384-vitbfocal10/Logit_Vilt_captioning_testing_batch-size_1024_encoder_vit_base_patch32_384_lr_1e-4_iter_30_tags_vitbfocal10_ENC-DEC_len70.yaml (in=1504) (out=629) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/32-384-vitbfocal40/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/32-384-vitbfocal40/Logit_Vilt_captioning_testing_batch-size_1024_encoder_vit_base_patch32_384_lr_1e-4_iter_30_tags_vitbfocal40_ENC-DEC_len70.yaml (in=1492) (out=625) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-VinVL/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-VinVL/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_VinVL_tags_ENC-DEC.yaml (in=1464) (out=618) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_005_noexpand.yaml (in=1811) (out=787) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_split_8.yaml (in=1832) (out=797) (deflated 56%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1.yaml (in=1816) (out=792) (deflated 56%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.2_split_8.yaml (in=1832) (out=796) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_005.yaml (in=1817) (out=793) (deflated 56%) adding: 
aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.2.yaml (in=1816) (out=792) (deflated 56%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/v2/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/v2/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_clsemb_all_tokens.yaml (in=1897) (out=819) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/v2/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_new_bertemb_test.yaml (in=1865) (out=806) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/v2/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_bert.yaml (in=1917) (out=815) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/v2/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_bertemb_all_tokens.yaml (in=1900) (out=814) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/v2/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_bert_init_from_caption_only-noall.yaml (in=1992) (out=833) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/v2/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_bert_init_from_caption_only.yaml (in=1980) (out=827) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/v2/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_bert_pred_tag_caption.yaml (in=1977) (out=832) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/v2/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_new_bertemb.yaml (in=1871) (out=808) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/v2/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_no_tags.yaml (in=1843) (out=803) (deflated 56%) adding: 
aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/v2/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_bert_init_from_caption-all+vinvl.yaml (in=1993) (out=838) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/v2/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_bertemb.yaml (in=1863) (out=805) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/v2/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_clsemb.yaml (in=1860) (out=812) (deflated 56%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/v2/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_nltk.yaml (in=1903) (out=818) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/v2/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_new_bertemb_all_tokens2.yaml (in=1910) (out=819) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/v2/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_bert_fuse_pred_tag_caption.yaml (in=2015) (out=847) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/v2/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_bert_init_from_caption+vinvl.yaml (in=1974) (out=834) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/v2/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_bert_gt_tag_caption.yaml (in=1971) (out=831) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/v2/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_new_bertemb_all_tokens.yaml (in=1908) (out=815) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/v2/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_new_bertemb_all_tokens-gradient.yaml (in=1961) (out=825) (deflated 58%) adding: 
aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/v2/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_bert_init_from_caption_oscar.yaml (in=1981) (out=848) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.2_split_8.yaml (in=1832) (out=796) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_split_8.yaml (in=1832) (out=797) (deflated 56%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1.yaml (in=1810) (out=790) (deflated 56%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_tie_tag_bert_weight.yaml (in=1860) (out=801) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_bertemb.yaml (in=1857) (out=803) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_clsemb.yaml (in=1854) (out=809) (deflated 56%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.3.yaml (in=1816) (out=793) (deflated 56%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_005_noexpand.yaml (in=1811) (out=787) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_clsemb2_nograd.yaml (in=1870) (out=814) (deflated 56%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_noexpand.yaml (in=1810) (out=786) (deflated 57%) adding: 
aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_clsemb_nograd.yaml (in=1868) (out=814) (deflated 56%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_005.yaml (in=1817) (out=793) (deflated 56%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_clsemb2_nograd_alltokens.yaml (in=1907) (out=823) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal10/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal10/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_5e-5_iter_30_VinVL_tags_ENC-DEC.yaml (in=1482) (out=624) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal40crop008/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal40crop008/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_5e-5_iter_30_VinVL_tags_ENC-DEC.yaml (in=1503) (out=630) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal40crop008/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_5e-5_iter_30_VinVL_tags_ENC-DEC_conf_0.4.yaml (in=1521) (out=634) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal40crop008/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_5e-5_iter_30_VinVL_tags_ENC-DEC_conf_0.8.yaml (in=1521) (out=633) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal40crop008/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_5e-5_iter_30_VinVL_tags_ENC-DEC_conf_0.6.yaml (in=1521) (out=634) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/32-384-VinVL/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/32-384-VinVL/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_tags_vinvl_ENC-DEC_len70.yaml (in=1517) (out=628) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_patch_select_1.0.yaml (in=1753) (out=719) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_Zhiyuan-PyTorch-Test_1626887552_3db73349_correct.yaml (in=2160) (out=858) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_VLP_multi_scale_test.yaml (in=1825) (out=719) (deflated 61%) 
adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_30__vitbfocal10_vinvl_tags.yaml (in=1757) (out=762) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_tags_linvits_test.yaml (in=1789) (out=715) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_Zhiyuan-PyTorch-Test_1626596599_66f2f469.yaml (in=2138) (out=856) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_tags_vitb_ENC-DEC_len70_conf_0.5.yaml (in=1507) (out=628) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_1024_encoder_vit_base_patch32_384_lr_1e-4_iter_30_tags_vinvl_ENC-DEC_len70.yaml (in=1506) (out=622) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_generate_tags_ENC-DEC_inference.yaml (in=1420) (out=627) (deflated 56%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_128_encoder_vit_base_patch16_480_lr_1e-4_iter_30_tags_linvits.yaml (in=1795) (out=719) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_16_224_lr_1e-4_iter_30_without_VLP_multi_two_tower.yaml (in=1911) (out=723) (deflated 62%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_0.9.yaml (in=2050) (out=821) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_1024_encoder_vit_base_patch32_384_lr_1e-4_iter_30_tags_vinvl_ENC-DEC_len70_nopad.yaml (in=1514) (out=632) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_later_concat_2.yaml (in=1927) (out=770) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Distill_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_multi_scale_token_sample_0.7.yaml (in=1873) (out=764) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_16_384_lr_1e-4_iter_30_without_VLP_multi_two_tower.yaml (in=1911) (out=725) (deflated 62%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_VinVL_tags_ENC-DEC.yaml (in=1464) (out=618) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_80_iter_VLP_multi_scale2.yaml (in=2046) (out=815) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_uni_vlp_distill_80_vinvl_tag-iter_pretraining.yaml 
(in=2189) (out=883) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Distill_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_20_with_VLP_multi_scale.yaml (in=1741) (out=722) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_uni_vlp_distill_80_iter_pretraining_test.yaml (in=2142) (out=784) (deflated 63%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_kmeans_select_0.6.yaml (in=1830) (out=746) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_topk_attn_token_select_0.6_layer_9_no_infer.yaml (in=2067) (out=799) (deflated 61%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_without_VLP_multi_scale_tower_large_test.yaml (in=2095) (out=750) (deflated 64%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_uni_vlp_distill_80_vinvl-vitfocal40_tag-iter_pretraining.yaml (in=2224) (out=893) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_batch-size_256_ENC_DEC_encoder_vit_base_patch16_384_lr_5e-5_iter_30_uni_vlp_distill_80_iter_pretraining_with_vinvl_tags.yaml (in=2039) (out=834) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_tags_linvits.yaml (in=1796) (out=719) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_tags_eff0f.yaml (in=1781) (out=714) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_without_VLP_multi_scale_self_attention_0.6.yaml (in=2340) (out=898) (deflated 62%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_1024_encoder_vit_base_patch32_384_lr_1e-4_iter_30_no_tags.yaml (in=1567) (out=654) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_random_select_0.6.yaml (in=1923) (out=768) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_tags_linvits.yaml (in=1795) (out=717) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_patch_select_0.6.yaml (in=1788) (out=731) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_tags_eff4f.yaml (in=1779) (out=712) (deflated 60%) adding: 
aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_later_concat_4.yaml (in=1927) (out=770) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_multi_scale_token_sample_0.9_patch_select_identity3.yaml (in=1866) (out=755) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_without_VLP_multi_scale_random_0.3.yaml (in=1822) (out=721) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_no_tags_ENC-DEC_80_epoch_S2S_pre-training.yaml (in=1875) (out=783) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_without_VLP_multi_scale_token_drop.yaml (in=1856) (out=724) (deflated 61%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_ENC-DEC.yaml (in=1835) (out=728) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_patch_select_0.3.yaml (in=1787) (out=728) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Distill_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_multi_scale_token_sample_0.8.yaml (in=1873) (out=763) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_uni_vlp_distill_80_iter_pretraining.yaml (in=2125) (out=859) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP_multi_scale_token_sample_0.9_patch_select.yaml (in=1748) (out=703) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Distill_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_multi_scale_token_sample_0.9_.yaml (in=1882) (out=764) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_tags_eff0f_test.yaml (in=1775) (out=712) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_later_concat_6.yaml (in=1927) (out=771) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_without_VLP_multi_scale_tower_large.yaml (in=2077) (out=745) (deflated 64%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_no_tags.yaml (in=1786) (out=715) (deflated 60%) adding: 
aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_later_concat_3.yaml (in=1927) (out=772) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_60_sqr_attention_alter.yaml (in=1847) (out=729) (deflated 61%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_batch-size_256_ENC_DEC_encoder_vit_base_patch16_384_lr_5e-5_iter_30_uni_vlp_distill_80_iter_pretraining_with_vinvl_tags_inference.yaml (in=1825) (out=745) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_tags_vinvl.yaml (in=1789) (out=715) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_later_concat_8.yaml (in=1927) (out=772) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Distill_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_multi_scale.yaml (in=1767) (out=731) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_no_tags_ENC-DEC.yaml (in=1446) (out=609) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_1024_encoder_vit_base_patch32_384_lr_1e-4_iter_30_tags_vinvl_ENC-DEC.yaml (in=1505) (out=624) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_without_VLP_multi_scale.yaml (in=1682) (out=715) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_60_without_VLP_multi_scale.yaml (in=1682) (out=714) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_60_without_VLP_multi_scale_0ce_1logit.yaml (in=1704) (out=721) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/Logit_Distill_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_without_VLP_multi_scale_0ce_1logit_10hidden.yaml (in=1736) (out=731) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1.0e-4_iter_30_without_VLP_multi_scale_token_sample_378.yaml (in=1726) (out=736) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/Logit_Distill_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_without_VLP_multi_scale.yaml (in=1682) (out=714) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/Logit_Distill_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_224_lr_1e-4_iter_30_without_VLP.yaml (in=1546) (out=677) (deflated 56%) adding: aux_data/Jacob_config/coco_Distill/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_224_lr_1e-4_iter_30_without_VLP.yaml (in=1546) (out=677) (deflated 56%) adding: 
aux_data/Jacob_config/coco_Distill/Textual_hid_Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_224_lr_1e-4_iter_30_without_VLP_hidden_weight_10.yaml (in=1682) (out=716) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/Logit_Distill_Vilt_captioning_testing_batch-size_1024_encoder_vit_base_patch16_384_lr_2e-4_iter_30_without_VLP.yaml (in=1547) (out=679) (deflated 56%) adding: aux_data/Jacob_config/coco_Distill/Logit_Distill_Vilt_captioning_testing_batch-size_1024_encoder_vit_base_patch16_224_lr_2e-4_iter_30_without_VLP.yaml (in=1547) (out=679) (deflated 56%) adding: aux_data/Jacob_config/coco_Distill/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_224_lr_1e-4_iter_30_without_VLP_debug.yaml (in=1690) (out=732) (deflated 57%) adding: aux_data/Jacob_config/Tagger/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/Tagger/Jacob_Tagger_TaxOpenImagesV6_B_Vilt_ViT_16_384_10_epoch_lr_5e-5_BS_512.yaml (in=1241) (out=574) (deflated 54%) adding: aux_data/Jacob_config/Tagger/Jacob_Tagger_TaxOpenImagesV6_B_Vilt_ViT_16_224_10_epoch_lr_0.02_BS_512_SGD.yaml (in=1231) (out=589) (deflated 52%) adding: aux_data/Jacob_config/Tagger/Jacob_Tagger_TaxCCSBUCocoVGCap_B_Vilt_ViT_16_384_40_epoch_lr_5e-5_BS_1024_loss_rank_crop_0.08.yaml (in=1355) (out=610) (deflated 55%) adding: aux_data/Jacob_config/Tagger/Jacob_Tagger_TaxViLT9M_B_Vilt_ViT_16_384_40_epoch_lr_5e-5_BS_1024_loss_focal_0.5_2.yaml (in=1281) (out=590) (deflated 54%) adding: aux_data/Jacob_config/Tagger/Jacob_Tagger_TaxViLT9M_B_Vilt_ViT_16_384_10_epoch_lr_5e-5_BS_512.yaml (in=1249) (out=577) (deflated 54%) adding: aux_data/Jacob_config/Tagger/Jacob_Tagger_TaxCCSBUCocoVGCap_B_Vilt_ViT_16_384_40_epoch_lr_5e-5_BS_1024_loss_rank_crop_0.08_inference.yaml (in=1350) (out=610) (deflated 55%) adding: aux_data/Jacob_config/Tagger/Jacob_Tagger_TaxCCSBUCocoVGCap_B_Vilt_ViT_16_384_40_epoch_lr_5e-5_BS_1024_loss_focal_crop_0.9_inference.yaml (in=1350) (out=610) (deflated 55%) adding: aux_data/Jacob_config/Tagger/Jacob_Tagger_TaxOpenImagesV6_B_Vilt_ViT_16_224_10_epoch_lr_0.1_BS_512_SGD.yaml (in=1228) (out=588) (deflated 52%) adding: aux_data/Jacob_config/Tagger/Jacob_Tagger_TaxCCSBUCocoVGCap_B_Vilt_ViT_16_384_10_epoch_lr_5e-5_BS_1024_loss_focal_crop_0.08.yaml (in=1358) (out=609) (deflated 55%) adding: aux_data/Jacob_config/Tagger/Jacob_Tagger_TaxViLT9M_B_Vilt_ViT_16_384_10_epoch_lr_5e-5_BS_1024.yaml (in=1252) (out=577) (deflated 54%) adding: aux_data/Jacob_config/Tagger/Jacob_Tagger_TaxCCSBUCocoVGCap_B_Vilt_ViT_16_384_40_epoch_lr_5e-5_BS_1024_loss_rank_crop_0.9.yaml (in=1352) (out=610) (deflated 55%) adding: aux_data/Jacob_config/Tagger/Jacob_Tagger_TaxCocoCaption_B_Vilt_ViT_16_384_10_epoch_lr_2e-2_BS_256.yaml (in=1241) (out=573) (deflated 54%) adding: aux_data/Jacob_config/Tagger/Jacob_Tagger_TaxViLT9M_B_Vilt_ViT_16_384_10_epoch_lr_5e-5_BS_1024_loss_rank.yaml (in=1278) (out=589) (deflated 54%) adding: aux_data/Jacob_config/Tagger/Jacob_Tagger_TaxViLT9M_B_Vilt_ViT_16_384_10_epoch_lr_5e-5_BS_1024_loss_focal_0.5_2_inference_tags.yaml (in=1332) (out=603) (deflated 55%) adding: aux_data/Jacob_config/Tagger/Jacob_Tagger_TaxViLT9M_B_Vilt_ViT_16_384_10_epoch_lr_5e-5_BS_1024_loss_focal_0.5_2.yaml (in=1281) (out=590) (deflated 54%) adding: aux_data/Jacob_config/Tagger/Jacob_Tagger_TaxCCSBUCocoVGCap_B_Vilt_ViT_16_384_20_epoch_lr_5e-5_BS_1024_loss_focal_crop_0.08_category_bert.yaml (in=1539) (out=665) (deflated 57%) adding: aux_data/Jacob_config/Tagger/ablation/ (in=0) (out=0) (stored 0%) adding: 
aux_data/Jacob_config/Tagger/ablation/Jacob_Tagger_TaxCCSBUCocoVGCap_B_Vilt_ViT_16_384_10_epoch_lr_5e-5_BS_1024_loss_focal_crop_0.08_caption+vinvl.yaml (in=1416) (out=634) (deflated 55%) adding: aux_data/Jacob_config/Tagger/ablation/Jacob_Tagger_TaxCCSBUCocoVGCap_B_Vilt_ViT_16_384_10_epoch_lr_5e-5_BS_1024_loss_focal_crop_0.08_oscar.yaml (in=1469) (out=662) (deflated 55%) adding: aux_data/Jacob_config/Tagger/ablation/Jacob_Tagger_TaxCCSBUCocoVGCap_B_Vilt_ViT_16_384_10_epoch_lr_5e-5_BS_1024_loss_focal_crop_0.08_caption+vinvl_all-tokens.yaml (in=1438) (out=637) (deflated 56%) adding: aux_data/Jacob_config/Tagger/ablation/Jacob_Tagger_TaxCCSBUCocoVGCap_B_Vilt_ViT_16_384_10_epoch_lr_5e-5_BS_1024_loss_focal_crop_0.08_caption_only.yaml (in=1423) (out=636) (deflated 55%) adding: aux_data/Jacob_config/Tagger/Jacob_Tagger_TaxOpenImagesV6_B_Vilt_ViT_16_224_10_epoch_lr_5e-5_BS_512.yaml (in=1244) (out=576) (deflated 54%) adding: aux_data/Jacob_config/Tagger/Jacob_Tagger_TaxViLT9M_B_Vilt_ViT_16_384_10_epoch_lr_5e-5_BS_1024_loss_rank_inference_tags.yaml (in=1329) (out=603) (deflated 55%) adding: aux_data/Jacob_config/Tagger/Jacob_Tagger_TaxCCSBUCocoVGCap_B_Vilt_ViT_16_384_40_epoch_lr_5e-5_BS_1024_loss_focal_crop_0.9.yaml (in=1355) (out=610) (deflated 55%) adding: aux_data/Jacob_config/Tagger/Jacob_Tagger_TaxOpenImagesV6_B_Vilt_ViT_16_224_10_epoch_lr_0.1_BS_512_SGD_sigmoid.yaml (in=1274) (out=600) (deflated 53%) adding: aux_data/Jacob_config/Tagger/Jacob_Tagger_TaxViLT9M_B_Vilt_ViT_16_384_40_epoch_lr_5e-5_BS_1024_loss_focal_0.5_2_inference_tags.yaml (in=1357) (out=615) (deflated 55%) adding: aux_data/Jacob_config/Tagger/Jacob_Tagger_TaxViLT9M_B_Vilt_ViT_16_384_10_epoch_lr_5e-5_BS_1024_loss_focal_0.5_2_crop_0.08.yaml (in=1281) (out=590) (deflated 54%) adding: aux_data/Jacob_config/Tagger/Jacob_Tagger_TaxCCSBUCocoVGCap_B_Vilt_ViT_16_384_40_epoch_lr_5e-5_BS_1024_loss_focal_crop_0.08_inference.yaml (in=1352) (out=608) (deflated 55%) adding: aux_data/Jacob_config/Tagger/Jacob_Tagger_TaxOpenImagesV6_B_Vilt_ViT_16_384_10_epoch_lr_5e-5_BS_512_inference_tags.yaml (in=1272) (out=597) (deflated 53%) adding: aux_data/Jacob_config/Debug/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/Debug/vlp_uni_pipeline_distill_debub_vinvl.yaml (in=2004) (out=819) (deflated 59%) adding: aux_data/Jacob_config/Debug/vqa_uni_pipeline_debug_test.yaml (in=1505) (out=638) (deflated 58%) adding: aux_data/Jacob_config/Debug/checkpoint_uni_pipeline_debug.yaml (in=1195) (out=558) (deflated 53%) adding: aux_data/Jacob_config/Debug/vqa_uni_pipeline_debug.yaml (in=1450) (out=594) (deflated 59%) adding: aux_data/Jacob_config/Debug/distill_caption_uni_pipeline_debug.yaml (in=6083) (out=1858) (deflated 69%) adding: aux_data/Jacob_config/Debug/distill_caption_uni_pipeline_debug_multi_tower.yaml (in=2371) (out=848) (deflated 64%) adding: aux_data/Jacob_config/Debug/vqa_uni_pipeline_distill.yaml (in=1786) (out=812) (deflated 55%) adding: aux_data/Jacob_config/Debug/kim_vqa_uni_pipeline_distill_debug.yaml (in=1579) (out=734) (deflated 54%) adding: aux_data/Jacob_config/Debug/VinVL_Taxcococaption_bid_finetune.yaml (in=1208) (out=544) (deflated 55%) adding: aux_data/Jacob_config/Debug/kim_captioning.yaml (in=1516) (out=635) (deflated 58%) adding: aux_data/Jacob_config/Debug/nocaps_vilt_test.yaml (in=1247) (out=597) (deflated 52%) adding: aux_data/Jacob_config/Debug/others/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/Debug/others/VLP.yaml (in=658) (out=366) (deflated 44%) adding: aux_data/Jacob_config/Debug/others/Retrieval.yaml 
(in=702) (out=390) (deflated 44%) adding: aux_data/Jacob_config/Debug/others/Caption.yaml (in=1086) (out=505) (deflated 53%) adding: aux_data/Jacob_config/Debug/others/Caption_test.yaml (in=719) (out=378) (deflated 47%) adding: aux_data/Jacob_config/Debug/kim_vqa_uni_pipeline_debug.yaml (in=1264) (out=574) (deflated 55%) adding: aux_data/Jacob_config/Debug/kim_vqa_uni_pipeline_debug_test.yaml (in=1260) (out=572) (deflated 55%) adding: aux_data/Jacob_config/Debug/tagger_uni_pipeline_debug.yaml (in=1944) (out=845) (deflated 57%) adding: aux_data/Jacob_config/Debug/vlp_uni_pipeline_distill_debug.yaml (in=2320) (out=858) (deflated 63%) adding: aux_data/Jacob_config/Debug/nosample_inference.yaml (in=1451) (out=621) (deflated 57%) adding: aux_data/Jacob_config/Debug/vinvl_caption_uni_pipeline_debug.yaml (in=1119) (out=507) (deflated 55%) adding: aux_data/Jacob_config/Debug/caption_uni_pipeline_debug.yaml (in=2538) (out=929) (deflated 63%) adding: aux_data/Jacob_config/Debug/caption_uni_pipeline_teacher.yaml (in=1561) (out=688) (deflated 56%) adding: aux_data/Jacob_config/Debug/vlp_uni_pipeline_debug.yaml (in=1507) (out=635) (deflated 58%) adding: aux_data/aml/ (in=0) (out=0) (stored 0%) adding: aux_data/aml/Vision_GPU/ (in=0) (out=0) (stored 0%) adding: aux_data/aml/Vision_GPU/config.json (in=137) (out=120) (deflated 12%) adding: aux_data/aml/Vision_GPU/aml.yaml (in=926) (out=477) (deflated 48%) adding: aux_data/aml/Vision_GPU/compute_target.json (in=37) (out=37) (stored 0%) adding: aux_data/aml/128V100X8/ (in=0) (out=0) (stored 0%) adding: aux_data/aml/128V100X8/config.json (in=138) (out=121) (deflated 12%) adding: aux_data/aml/128V100X8/aml.yaml (in=926) (out=483) (deflated 48%) adding: aux_data/aml/128V100X8/compute_target.json (in=37) (out=37) (stored 0%) adding: aux_data/aml/config.json (in=134) (out=118) (deflated 12%) adding: aux_data/aml/docker/ (in=0) (out=0) (stored 0%) adding: aux_data/aml/docker/Vision_GPU/ (in=0) (out=0) (stored 0%) adding: aux_data/aml/docker/Vision_GPU/config.json (in=137) (out=120) (deflated 12%) adding: aux_data/aml/docker/Vision_GPU/compute_target.json (in=37) (out=37) (stored 0%) adding: aux_data/aml/docker/pytorch1.6/ (in=0) (out=0) (stored 0%) adding: aux_data/aml/docker/pytorch1.6/environment.json (in=201) (out=133) (deflated 34%) adding: aux_data/aml/docker/pytorch1.4/ (in=0) (out=0) (stored 0%) adding: aux_data/aml/docker/pytorch1.4/environment.json (in=257) (out=152) (deflated 41%) adding: aux_data/aml/we3v32_eastus/ (in=0) (out=0) (stored 0%) adding: aux_data/aml/we3v32_eastus/config.json (in=158) (out=121) (deflated 23%) adding: aux_data/aml/we3v32_eastus/aml.yaml (in=1026) (out=548) (deflated 47%) adding: aux_data/aml/we3v32_eastus/compute_target.json (in=42) (out=42) (stored 0%) adding: aux_data/aml/aml.yaml (in=926) (out=478) (deflated 48%) adding: aux_data/aml/others/ (in=0) (out=0) (stored 0%) adding: aux_data/aml/others/aml_test.yaml (in=2406) (out=741) (deflated 69%) adding: aux_data/aml/compute_target.json (in=37) (out=37) (stored 0%) adding: aux_data/aml/VLP32GB/ (in=0) (out=0) (stored 0%) adding: aux_data/aml/VLP32GB/config.json (in=134) (out=118) (deflated 12%) adding: aux_data/aml/VLP32GB/aml.yaml (in=926) (out=478) (deflated 48%) adding: aux_data/aml/VLP32GB/compute_target.json (in=37) (out=37) (stored 0%) adding: aux_data/aml/datablobs/ (in=0) (out=0) (stored 0%) adding: aux_data/aml/datablobs/vigeast/ (in=0) (out=0) (stored 0%) adding: aux_data/aml/datablobs/vigeast/datastore.json (in=360) (out=279) (deflated 23%) adding: 
aux_data/aml/datablobs/vig/ (in=0) (out=0) (stored 0%) adding: aux_data/aml/datablobs/vig/datastore.json (in=352) (out=267) (deflated 24%) adding: aux_data/aml/CustVisP100/ (in=0) (out=0) (stored 0%) adding: aux_data/aml/CustVisP100/config.json (in=138) (out=121) (deflated 12%) adding: aux_data/aml/CustVisP100/compute_target.json (in=37) (out=37) (stored 0%) adding: aux_data/aml/cluster_base.yaml (in=1868) (out=706) (deflated 62%) adding: aux_data/aml/we3v32/ (in=0) (out=0) (stored 0%) adding: aux_data/aml/we3v32/config.json (in=159) (out=122) (deflated 23%) adding: aux_data/aml/we3v32/aml.yaml (in=1020) (out=546) (deflated 46%) adding: aux_data/aml/we3v32/compute_target.json (in=43) (out=43) (stored 0%) adding: aux_data/aml/CustVis32GB/ (in=0) (out=0) (stored 0%) adding: aux_data/aml/CustVis32GB/config.json (in=138) (out=121) (deflated 12%) adding: aux_data/aml/CustVis32GB/aml.yaml (in=926) (out=478) (deflated 48%) adding: aux_data/aml/CustVis32GB/compute_target.json (in=37) (out=37) (stored 0%) adding: aux_data/untrained_config/ (in=0) (out=0) (stored 0%) adding: aux_data/untrained_config/VILT-L12-H784-uncased_16_224/ (in=0) (out=0) (stored 0%) adding: aux_data/untrained_config/VILT-L12-H784-uncased_16_224/config.json (in=570) (out=287) (deflated 50%) adding: aux_data/untrained_config/VILT-L12-H784-uncased_16_224/VILT-L12-H784-uncased-vocab-nlg.txt (in=231478) (out=109832) (deflated 53%) adding: aux_data/untrained_config/VILT-L12-H784-uncased_16_224/vocab.txt (in=231508) (out=109776) (deflated 53%) adding: aux_data/untrained_config/VILT-L12-H784-uncased_16_384/ (in=0) (out=0) (stored 0%) adding: aux_data/untrained_config/VILT-L12-H784-uncased_16_384/config.json (in=570) (out=288) (deflated 49%) adding: aux_data/untrained_config/VILT-L12-H784-uncased_16_384/VILT-L12-H784-uncased-vocab-nlg.txt (in=231478) (out=109832) (deflated 53%) adding: aux_data/untrained_config/VILT-L12-H784-uncased_16_384/vocab.txt (in=231508) (out=109776) (deflated 53%) adding: aux_data/untrained_config/MiniLM-L12-H384-uncased-rand/ (in=0) (out=0) (stored 0%) adding: aux_data/untrained_config/MiniLM-L12-H384-uncased-rand/special_tokens_map.json (in=112) (out=67) (deflated 40%) adding: aux_data/untrained_config/MiniLM-L12-H384-uncased-rand/config.json (in=313) (out=167) (deflated 47%) adding: aux_data/untrained_config/MiniLM-L12-H384-uncased-rand/minilm-l12-h384-uncased-vocab-nlg.txt (in=231478) (out=109832) (deflated 53%) adding: aux_data/untrained_config/MiniLM-L12-H384-uncased-rand/vocab.txt (in=231508) (out=109776) (deflated 53%) adding: aux_data/untrained_config/MiniLM-L12-H384-uncased-rand/added_tokens.json (in=2) (out=2) (stored 0%) adding: aux_data/untrained_config/Oscar-L12-H784-uncased/ (in=0) (out=0) (stored 0%) adding: aux_data/untrained_config/Oscar-L12-H784-uncased/config.json (in=340) (out=180) (deflated 47%) adding: aux_data/untrained_config/Oscar-L12-H784-uncased/Oscar-L12-H784-uncased-vocab-nlg.txt (in=231478) (out=109832) (deflated 53%) adding: aux_data/untrained_config/Oscar-L12-H784-uncased/vocab.txt (in=231508) (out=109776) (deflated 53%) adding: aux_data/untrained_config/VILT-L12-H784-uncased_32_384/ (in=0) (out=0) (stored 0%) adding: aux_data/untrained_config/VILT-L12-H784-uncased_32_384/config.json (in=570) (out=288) (deflated 49%) adding: aux_data/untrained_config/VILT-L12-H784-uncased_32_384/VILT-L12-H784-uncased-vocab-nlg.txt (in=231478) (out=109832) (deflated 53%) adding: aux_data/untrained_config/VILT-L12-H784-uncased_32_384/vocab.txt (in=231508) (out=109776) (deflated 53%) adding: 
compile.aml.sh (in=966) (out=329) (deflated 66%) adding: dog.jpg (in=145305) (out=144651) (deflated 0%) adding: entry.py (in=227) (out=127) (deflated 44%) adding: flops.pdf (in=28598) (out=21980) (deflated 23%) adding: images/ (in=0) (out=0) (stored 0%) adding: mask_output (in=116) (out=116) (stored 0%) adding: models (in=121) (out=121) (stored 0%) adding: requirements.txt (in=627) (out=386) (deflated 38%) adding: scripts/ (in=0) (out=0) (stored 0%) adding: scripts/irisextract.py (in=2337) (out=940) (deflated 60%) adding: scripts/mergebn2.py (in=4335) (out=1417) (deflated 67%) adding: scripts/model_initialization.py (in=5278) (out=1690) (deflated 68%) adding: scripts/qd_pytorch.py (in=0) (out=0) (stored 0%) adding: scripts/model_inference.py (in=4962) (out=1837) (deflated 63%) adding: scripts/torch_from_imagenet.py (in=11565) (out=3238) (deflated 72%) adding: scripts/taxonomy.py (in=29618) (out=7273) (deflated 75%) adding: scripts/trainrpn.py (in=6746) (out=2443) (deflated 64%) adding: scripts/gen_rpnprototxt.py (in=8567) (out=1988) (deflated 77%) adding: scripts/qd_const.py (in=108) (out=63) (deflated 42%) adding: scripts/wt_stats.py (in=3271) (out=1299) (deflated 60%) adding: scripts/share.py (in=1434) (out=512) (deflated 64%) adding: scripts/cocoeval.py (in=4618) (out=1454) (deflated 69%) adding: scripts/qd_maskrcnn.py (in=36431) (out=8161) (deflated 78%) adding: scripts/prepare_voc.py (in=6890) (out=2109) (deflated 69%) adding: scripts/convert_to_tsv.py (in=12824) (out=3892) (deflated 70%) adding: scripts/eval.py (in=3897) (out=1318) (deflated 66%) adding: scripts/process_image.py (in=5976) (out=1793) (deflated 70%) adding: scripts/ssddet.py (in=8743) (out=3129) (deflated 64%) adding: scripts/torch_transfer_learning.py (in=4888) (out=1611) (deflated 67%) adding: scripts/a.py (in=1309) (out=495) (deflated 62%) adding: scripts/process_tsv.py (in=180050) (out=33296) (deflated 82%) adding: scripts/_init_paths.py (in=479) (out=269) (deflated 44%) adding: scripts/backup.py (in=1283) (out=519) (deflated 60%) adding: scripts/runt.py (in=54347) (out=12867) (deflated 76%) adding: scripts/print_result.py (in=3089) (out=1058) (deflated 66%) adding: scripts/__init__.py (in=0) (out=0) (stored 0%) adding: scripts/tsv_io.py (in=28353) (out=6035) (deflated 79%) adding: scripts/qd_lstm.py (in=8710) (out=3402) (deflated 61%) adding: scripts/yolodet.py (in=29313) (out=7050) (deflated 76%) adding: scripts/email_util.py (in=563) (out=273) (deflated 52%) adding: scripts/setup_pyfrcn.py (in=1601) (out=737) (deflated 54%) adding: scripts/remote_run.py (in=7891) (out=2104) (deflated 73%) adding: scripts/tsvdet.py (in=9866) (out=2788) (deflated 72%) adding: scripts/test_unit.py (in=8306) (out=1729) (deflated 79%) adding: scripts/setup_caffe.py (in=1467) (out=645) (deflated 56%) adding: scripts/lineidx.py (in=581) (out=293) (deflated 50%) adding: scripts/yoloinit.py (in=24993) (out=4740) (deflated 81%) adding: scripts/deteval.py (in=39) (out=39) (stored 0%) adding: scripts/synsetizer.py (in=3706) (out=1266) (deflated 66%) adding: scripts/pytablemd.py (in=3492) (out=1236) (deflated 65%) adding: scripts/train.py (in=7527) (out=2663) (deflated 65%) adding: scripts/vis_bkg.py (in=3103) (out=1019) (deflated 67%) adding: scripts/roiextract.py (in=8593) (out=3069) (deflated 64%) adding: scripts/mergebn.py (in=4785) (out=1448) (deflated 70%) adding: scripts/tools.py (in=17660) (out=4831) (deflated 73%) adding: scripts/yoloeval.py (in=12288) (out=3866) (deflated 69%) adding: scripts/exps.py (in=64) (out=61) (deflated 5%) 
adding: scripts/latex_writer.py (in=34) (out=34) (stored 0%)
adding: scripts/hdf5datalayer.py (in=1641) (out=669) (deflated 59%)
adding: scripts/create_mnist.py (in=2610) (out=915) (deflated 65%)
adding: scripts/gen_prototxt.py (in=3928) (out=1283) (deflated 67%)
adding: scripts/msoftmax.py (in=40877) (out=6906) (deflated 83%)
adding: scripts/q_gen_csv.py (in=9776) (out=2599) (deflated 73%)
adding: scripts/iristrain.py (in=9198) (out=2972) (deflated 68%)
adding: scripts/process_dataset.py (in=11721) (out=2798) (deflated 76%)
adding: scripts/yolotree_init.py (in=18746) (out=4223) (deflated 77%)
adding: scripts/garbage_collector.py (in=2396) (out=854) (deflated 64%)
adding: scripts/drawresults.py (in=3502) (out=1271) (deflated 64%)
adding: scripts/deteval_voc.py (in=6279) (out=2111) (deflated 66%)
adding: scripts/wordtree.py (in=1744) (out=581) (deflated 67%)
adding: scripts/demo_detection.py (in=14418) (out=3690) (deflated 74%)
adding: scripts/qd_common.py (in=66902) (out=16658) (deflated 75%)
adding: scripts/templatenet.py (in=335) (out=221) (deflated 34%)
adding: scripts/qd_util.py (in=352217) (out=72775) (deflated 79%)
adding: scripts/rpneval.py (in=4845) (out=1699) (deflated 65%)
adding: src/ (in=0) (out=0) (stored 0%)
adding: src/linear_attention_transformer/ (in=0) (out=0) (stored 0%)
adding: src/linear_attention_transformer/autoregressive_wrapper.py (in=3575) (out=1242) (deflated 65%)
adding: src/linear_attention_transformer/__init__.py (in=339) (out=130) (deflated 62%)
adding: src/linear_attention_transformer/linear_attention_transformer.py (in=19083) (out=4720) (deflated 75%)
adding: src/linear_attention_transformer/autopadder.py (in=2102) (out=741) (deflated 65%)
adding: src/linear_attention_transformer/reversible.py (in=6104) (out=1840) (deflated 70%)
adding: src/linear_attention_transformer/images.py (in=1842) (out=621) (deflated 66%)
adding: src/qd/ (in=0) (out=0) (stored 0%)
adding: src/qd/evaluate/ (in=0) (out=0) (stored 0%)
adding: src/qd/evaluate/evaluate_openimages_google.py (in=27126) (out=6260) (deflated 77%)
adding: src/qd/evaluate/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/evaluate/oid_hierarchical_labels_expansion_tsv.py (in=8680) (out=2503) (deflated 71%)
adding: src/qd/qd_pytorch.py (in=131938) (out=28629) (deflated 78%)
adding: src/qd/examples.py (in=679) (out=286) (deflated 58%)
adding: src/qd/taxonomy.py (in=29726) (out=7349) (deflated 75%)
adding: src/qd/unittest/ (in=0) (out=0) (stored 0%)
adding: src/qd/unittest/test_qd_common.py (in=6192) (out=1629) (deflated 74%)
adding: src/qd/unittest/test_philly.py (in=554) (out=240) (deflated 57%)
adding: src/qd/unittest/test_masktsvdataset.py (in=3254) (out=933) (deflated 71%)
adding: src/qd/unittest/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/unittest/test_maskrcnn.py (in=0) (out=0) (stored 0%)
adding: src/qd/unittest/test_tsvdatasetdb.py (in=1754) (out=443) (deflated 75%)
adding: src/qd/unittest/test_pytorch.py (in=5194) (out=1491) (deflated 71%)
adding: src/qd/unittest/test_process_tsv.py (in=1272) (out=483) (deflated 62%)
adding: src/qd/unittest/test_cloud_storage.py (in=708) (out=232) (deflated 67%)
adding: src/qd/unittest/test_layers.py (in=1527) (out=566) (deflated 63%)
adding: src/qd/unittest/test_mmtsvdataset.py (in=2574) (out=882) (deflated 66%)
adding: src/qd/philly.py (in=300) (out=201) (deflated 33%)
adding: src/qd/pipeline.py (in=54481) (out=12104) (deflated 78%)
adding: src/qd/cocoeval.py (in=4757) (out=1490) (deflated 69%)
adding: src/qd/qd_maskrcnn.py (in=49926) (out=12035) (deflated 76%)
adding: src/qd/qd_yolov2pt.py (in=5183) (out=1516) (deflated 71%)
adding: src/qd/acc_query.py (in=2471) (out=861) (deflated 65%)
adding: src/qd/batch_process.py (in=9892) (out=2278) (deflated 77%)
adding: src/qd/image_text_align.py (in=12435) (out=2801) (deflated 77%)
adding: src/qd/torch_common.py (in=38968) (out=9661) (deflated 75%)
adding: src/qd/process_image.py (in=10643) (out=3052) (deflated 71%)
adding: src/qd/mask/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/structures/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/structures/tsv_file.py (in=2484) (out=892) (deflated 64%)
adding: src/qd/mask/structures/segmentation_mask.py (in=17446) (out=3939) (deflated 77%)
adding: src/qd/mask/structures/image_list.py (in=2664) (out=992) (deflated 63%)
adding: src/qd/mask/structures/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/structures/__init__.py (in=0) (out=0) (stored 0%)
adding: src/qd/mask/structures/boxlist_ops.py (in=6609) (out=1950) (deflated 70%)
adding: src/qd/mask/structures/keypoint.py (in=6555) (out=1795) (deflated 73%)
adding: src/qd/mask/structures/bounding_box.py (in=11698) (out=2887) (deflated 75%)
adding: src/qd/mask/layers/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/bert/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/bert/tokenization_bert.py (in=20945) (out=5422) (deflated 74%)
adding: src/qd/mask/layers/bert/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/bert/file_utils.py (in=8876) (out=2861) (deflated 68%)
adding: src/qd/mask/layers/bert/modeling_outputs.py (in=36759) (out=2071) (deflated 94%)
adding: src/qd/mask/layers/bert/modeling_utils.py (in=78488) (out=17476) (deflated 78%)
adding: src/qd/mask/layers/bert/__init__.py (in=1436) (out=488) (deflated 66%)
adding: src/qd/mask/layers/bert/modeling_bert.py (in=400457) (out=36841) (deflated 91%)
adding: src/qd/mask/layers/bert/others/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/bert/others/modeling_bert.py (in=458281) (out=36277) (deflated 92%)
adding: src/qd/mask/layers/bert/activations.py (in=1723) (out=707) (deflated 59%)
adding: src/qd/mask/layers/bert/modeling_mobilebert.py (in=69244) (out=12979) (deflated 81%)
adding: src/qd/mask/layers/bert/tokenization_utils.py (in=21245) (out=5003) (deflated 76%)
adding: src/qd/mask/layers/others_bert/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/others_bert/tokenization_bert.py (in=19521) (out=5148) (deflated 74%)
adding: src/qd/mask/layers/others_bert/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/others_bert/file_utils.py (in=8876) (out=2861) (deflated 68%)
adding: src/qd/mask/layers/others_bert/modeling_vilt.py (in=51587) (out=11246) (deflated 78%)
adding: src/qd/mask/layers/others_bert/modeling_outputs.py (in=36759) (out=2071) (deflated 94%)
adding: src/qd/mask/layers/others_bert/modeling_utils.py (in=77361) (out=17172) (deflated 78%)
adding: src/qd/mask/layers/others_bert/__init__.py (in=494) (out=263) (deflated 47%)
adding: src/qd/mask/layers/others_bert/modeling_bert.py (in=226036) (out=24196) (deflated 89%)
adding: src/qd/mask/layers/others_bert/fig/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/others_bert/fig/architecture_v2.pdf (in=143473) (out=130471) (deflated 9%)
adding: src/qd/mask/layers/others_bert/fig/acc.eps (in=35077) (out=11654) (deflated 67%)
adding: src/qd/mask/layers/others_bert/fig/params.pdf (in=15495) (out=11682) (deflated 25%)
adding: src/qd/mask/layers/others_bert/fig/flops.pdf (in=18494) (out=14218) (deflated 23%)
adding: src/qd/mask/layers/others_bert/fig/docProps/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/others_bert/fig/docProps/core.xml (in=691) (out=353) (deflated 49%)
adding: src/qd/mask/layers/others_bert/fig/docProps/thumbnail.jpeg (in=9569) (out=9019) (deflated 6%)
adding: src/qd/mask/layers/others_bert/fig/docProps/app.xml (in=1353) (out=532) (deflated 61%)
adding: src/qd/mask/layers/others_bert/fig/flops.eps (in=30629) (out=11224) (deflated 63%)
adding: src/qd/mask/layers/others_bert/fig/params.eps (in=26204) (out=9257) (deflated 65%)
adding: src/qd/mask/layers/others_bert/fig/vqa.pdf (in=17354) (out=13292) (deflated 23%)
adding: src/qd/mask/layers/others_bert/fig/architecture_v3.pdf (in=145597) (out=130418) (deflated 10%)
adding: src/qd/mask/layers/others_bert/fig/docMetadata/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/others_bert/fig/docMetadata/LabelInfo.xml (in=323) (out=238) (deflated 26%)
adding: src/qd/mask/layers/others_bert/fig/architecture1.pdf (in=117185) (out=104869) (deflated 11%)
adding: src/qd/mask/layers/others_bert/fig/cider.pdf (in=14564) (out=10997) (deflated 24%)
adding: src/qd/mask/layers/others_bert/fig/bert.pdf (in=19385) (out=14670) (deflated 24%)
adding: src/qd/mask/layers/others_bert/fig/vqa.eps (in=27463) (out=10284) (deflated 63%)
adding: src/qd/mask/layers/others_bert/fig/vary_detector.eps (in=40582) (out=12895) (deflated 68%)
adding: src/qd/mask/layers/others_bert/fig/arch-crop.pdf (in=139013) (out=134787) (deflated 3%)
adding: src/qd/mask/layers/others_bert/fig/vary_detector.pdf (in=22581) (out=17984) (deflated 20%)
adding: src/qd/mask/layers/others_bert/fig/cider.eps (in=22108) (out=8379) (deflated 62%)
adding: src/qd/mask/layers/others_bert/fig/arch.pdf (in=151742) (out=139606) (deflated 8%)
adding: src/qd/mask/layers/others_bert/fig/param.pdf (in=15572) (out=12009) (deflated 23%)
adding: src/qd/mask/layers/others_bert/fig/ppt/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/others_bert/fig/ppt/changesInfos/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/others_bert/fig/ppt/changesInfos/changesInfo1.xml (in=15571) (out=2242) (deflated 86%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slides/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slides/slide1.xml (in=99995) (out=8894) (deflated 91%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slides/_rels/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slides/_rels/slide1.xml.rels (in=1665) (out=274) (deflated 84%)
adding: src/qd/mask/layers/others_bert/fig/ppt/presentation.xml (in=3212) (out=544) (deflated 83%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideLayouts/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideLayouts/slideLayout11.xml (in=4200) (out=1168) (deflated 72%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideLayouts/slideLayout3.xml (in=5442) (out=1302) (deflated 76%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideLayouts/slideLayout7.xml (in=2550) (out=892) (deflated 65%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideLayouts/slideLayout4.xml (in=4975) (out=1184) (deflated 76%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideLayouts/slideLayout1.xml (in=4678) (out=1250) (deflated 73%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideLayouts/slideLayout6.xml (in=3064) (out=969) (deflated 68%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideLayouts/slideLayout2.xml (in=3921) (out=1080) (deflated 72%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideLayouts/slideLayout10.xml (in=3976) (out=1114) (deflated 72%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideLayouts/slideLayout8.xml (in=5952) (out=1421) (deflated 76%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideLayouts/slideLayout5.xml (in=7938) (out=1512) (deflated 81%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideLayouts/slideLayout9.xml (in=5899) (out=1367) (deflated 77%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideLayouts/_rels/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideLayouts/_rels/slideLayout8.xml.rels (in=311) (out=182) (deflated 41%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideLayouts/_rels/slideLayout6.xml.rels (in=311) (out=182) (deflated 41%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideLayouts/_rels/slideLayout3.xml.rels (in=311) (out=182) (deflated 41%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideLayouts/_rels/slideLayout11.xml.rels (in=311) (out=182) (deflated 41%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideLayouts/_rels/slideLayout5.xml.rels (in=311) (out=182) (deflated 41%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideLayouts/_rels/slideLayout10.xml.rels (in=311) (out=182) (deflated 41%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideLayouts/_rels/slideLayout2.xml.rels (in=311) (out=182) (deflated 41%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideLayouts/_rels/slideLayout1.xml.rels (in=311) (out=182) (deflated 41%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideLayouts/_rels/slideLayout4.xml.rels (in=311) (out=182) (deflated 41%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideLayouts/_rels/slideLayout7.xml.rels (in=311) (out=182) (deflated 41%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideLayouts/_rels/slideLayout9.xml.rels (in=311) (out=182) (deflated 41%)
adding: src/qd/mask/layers/others_bert/fig/ppt/tableStyles.xml (in=182) (out=165) (deflated 9%)
adding: src/qd/mask/layers/others_bert/fig/ppt/presProps.xml (in=964) (out=443) (deflated 54%)
adding: src/qd/mask/layers/others_bert/fig/ppt/theme/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/others_bert/fig/ppt/theme/theme1.xml (in=8399) (out=1692) (deflated 80%)
adding: src/qd/mask/layers/others_bert/fig/ppt/viewProps.xml (in=812) (out=382) (deflated 53%)
adding: src/qd/mask/layers/others_bert/fig/ppt/revisionInfo.xml (in=429) (out=270) (deflated 37%)
adding: src/qd/mask/layers/others_bert/fig/ppt/media/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/others_bert/fig/ppt/media/image2.png (in=792) (out=648) (deflated 18%)
adding: src/qd/mask/layers/others_bert/fig/ppt/media/image5.png (in=549) (out=464) (deflated 15%)
adding: src/qd/mask/layers/others_bert/fig/ppt/media/image1.jpeg (in=127146) (out=126948) (deflated 0%)
adding: src/qd/mask/layers/others_bert/fig/ppt/media/image9.png (in=721) (out=634) (deflated 12%)
adding: src/qd/mask/layers/others_bert/fig/ppt/media/image4.png (in=114842) (out=114862) (deflated 0%)
adding: src/qd/mask/layers/others_bert/fig/ppt/media/image6.png (in=596) (out=514) (deflated 14%)
adding: src/qd/mask/layers/others_bert/fig/ppt/media/image7.png (in=594) (out=507) (deflated 15%)
adding: src/qd/mask/layers/others_bert/fig/ppt/media/image3.png (in=78460) (out=78475) (deflated 0%)
adding: src/qd/mask/layers/others_bert/fig/ppt/media/image10.png (in=771) (out=688) (deflated 11%)
adding: src/qd/mask/layers/others_bert/fig/ppt/media/image8.png (in=643) (out=558) (deflated 13%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideMasters/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideMasters/slideMaster1.xml (in=13876) (out=2008) (deflated 86%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideMasters/_rels/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideMasters/_rels/slideMaster1.xml.rels (in=1991) (out=271) (deflated 86%)
adding: src/qd/mask/layers/others_bert/fig/ppt/_rels/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/others_bert/fig/ppt/_rels/presentation.xml.rels (in=1246) (out=321) (deflated 74%)
adding: src/qd/mask/layers/others_bert/fig/acc.pdf (in=19350) (out=14583) (deflated 25%)
adding: src/qd/mask/layers/others_bert/fig/_Content_Types_.xml (in=3528) (out=499) (deflated 86%)
adding: src/qd/mask/layers/others_bert/fig/_rels/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/others_bert/fig/bert.eps (in=35521) (out=11205) (deflated 68%)
adding: src/qd/mask/layers/others_bert/activations.py (in=1723) (out=707) (deflated 59%)
adding: src/qd/mask/layers/others_bert/nce_modeling_bert.py (in=97757) (out=17827) (deflated 82%)
adding: src/qd/mask/layers/others_bert/modeling_mobilebert.py (in=69244) (out=12979) (deflated 81%)
adding: src/qd/mask/layers/others_bert/tokenization_utils.py (in=20106) (out=4825) (deflated 76%)
adding: src/qd/mask/layers/clip/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/clip/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/clip/__init__.py (in=1) (out=1) (stored 0%)
adding: src/qd/mask/layers/clip/simple_tokenizer.py (in=4632) (out=1724) (deflated 63%)
adding: src/qd/mask/layers/clip/model.py (in=27045) (out=5540) (deflated 80%)
adding: src/qd/mask/layers/clip/clip.py (in=8363) (out=3075) (deflated 63%)
adding: src/qd/mask/layers/scale.py (in=270) (out=164) (deflated 39%)
adding: src/qd/mask/layers/dcn/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/dcn/deform_conv_func.py (in=8496) (out=1665) (deflated 80%)
adding: src/qd/mask/layers/dcn/deform_pool_func.py (in=2648) (out=695) (deflated 74%)
adding: src/qd/mask/layers/dcn/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/dcn/__init__.py (in=102) (out=88) (deflated 14%)
adding: src/qd/mask/layers/dcn/deform_conv_module.py (in=6076) (out=1163) (deflated 81%)
adding: src/qd/mask/layers/dcn/deform_pool_module.py (in=6307) (out=784) (deflated 88%)
adding: src/qd/mask/layers/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/__init__.py (in=1442) (out=445) (deflated 69%)
adding: src/qd/mask/layers/_utils.py (in=1165) (out=470) (deflated 60%)
adding: src/qd/mask/layers/batch_norm.py (in=1091) (out=438) (deflated 60%)
adding: src/qd/mask/layers/vilt/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/vilt/transforms/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/vilt/transforms/utils.py (in=1645) (out=666) (deflated 60%)
adding: src/qd/mask/layers/vilt/transforms/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/vilt/transforms/__init__.py (in=301) (out=148) (deflated 51%)
adding: src/qd/mask/layers/vilt/transforms/randaug.py (in=7025) (out=1962) (deflated 72%)
adding: src/qd/mask/layers/vilt/transforms/pixelbert.py (in=765) (out=279) (deflated 64%)
adding: src/qd/mask/layers/vilt/config.py (in=6216) (out=1343) (deflated 78%)
adding: src/qd/mask/layers/vilt/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/vilt/__init__.py (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/vilt/datamodules/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/vilt/datamodules/datamodule_base.py (in=5637) (out=1072) (deflated 81%)
adding: src/qd/mask/layers/vilt/datamodules/sbu_datamodule.py (in=375) (out=184) (deflated 51%)
adding: src/qd/mask/layers/vilt/datamodules/vqav2_datamodule.py (in=1413) (out=488) (deflated 65%)
adding: src/qd/mask/layers/vilt/datamodules/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/vilt/datamodules/multitask_datamodule.py (in=2712) (out=662) (deflated 76%)
adding: src/qd/mask/layers/vilt/datamodules/__init__.py (in=707) (out=221) (deflated 69%)
adding: src/qd/mask/layers/vilt/datamodules/nlvr2_datamodule.py (in=362) (out=184) (deflated 49%)
adding: src/qd/mask/layers/vilt/datamodules/coco_caption_karpathy_datamodule.py (in=496) (out=200) (deflated 60%)
adding: src/qd/mask/layers/vilt/datamodules/conceptual_caption_datamodule.py (in=396) (out=185) (deflated 53%)
adding: src/qd/mask/layers/vilt/datamodules/f30k_caption_karpathy_datamodule.py (in=496) (out=204) (deflated 59%)
adding: src/qd/mask/layers/vilt/datamodules/vg_caption_datamodule.py (in=401) (out=189) (deflated 53%)
adding: src/qd/mask/layers/vilt/datasets/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/vilt/datasets/vg_caption_dataset.py (in=506) (out=246) (deflated 51%)
adding: src/qd/mask/layers/vilt/datasets/f30k_caption_karpathy_dataset.py (in=615) (out=266) (deflated 57%)
adding: src/qd/mask/layers/vilt/datasets/sbu_caption_dataset.py (in=543) (out=267) (deflated 51%)
adding: src/qd/mask/layers/vilt/datasets/vqav2_dataset.py (in=1480) (out=481) (deflated 68%)
adding: src/qd/mask/layers/vilt/datasets/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/vilt/datasets/conceptual_caption_dataset.py (in=598) (out=284) (deflated 53%)
adding: src/qd/mask/layers/vilt/datasets/__init__.py (in=395) (out=148) (deflated 63%)
adding: src/qd/mask/layers/vilt/datasets/nlvr2_dataset.py (in=1610) (out=569) (deflated 65%)
adding: src/qd/mask/layers/vilt/datasets/base_dataset.py (in=8904) (out=2283) (deflated 74%)
adding: src/qd/mask/layers/vilt/datasets/coco_caption_karpathy_dataset.py (in=970) (out=389) (deflated 60%)
adding: src/qd/mask/layers/vilt/modules/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/vilt/modules/heads.py (in=1569) (out=443) (deflated 72%)
adding: src/qd/mask/layers/vilt/modules/vision_transformer.py (in=49230) (out=8865) (deflated 82%)
adding: src/qd/mask/layers/vilt/modules/vilt_module.py (in=4869) (out=1406) (deflated 71%)
adding: src/qd/mask/layers/vilt/modules/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/vilt/modules/__init__.py (in=73) (out=66) (deflated 10%)
adding: src/qd/mask/layers/vilt/modules/dist_utils.py (in=7814) (out=2322) (deflated 70%)
adding: src/qd/mask/layers/vilt/modules/vilt_utils.py (in=10912) (out=1854) (deflated 83%)
adding: src/qd/mask/layers/vilt/modules/objectives.py (in=22049) (out=5015) (deflated 77%)
adding: src/qd/mask/layers/vilt/gadgets/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/vilt/gadgets/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/vilt/gadgets/my_metrics.py (in=2359) (out=573) (deflated 76%)
adding: src/qd/mask/layers/vilt/gadgets/__init__.py (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/vilt/utils/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/vilt/utils/write_coco_karpathy.py (in=1904) (out=745) (deflated 61%)
adding: src/qd/mask/layers/vilt/utils/write_f30k_karpathy.py (in=1871) (out=736) (deflated 61%)
adding: src/qd/mask/layers/vilt/utils/glossary.py (in=4435) (out=1230) (deflated 72%)
adding: src/qd/mask/layers/vilt/utils/write_vqa.py (in=6523) (out=1678) (deflated 74%)
adding: src/qd/mask/layers/vilt/utils/write_sbu.py (in=1785) (out=712) (deflated 60%)
adding: src/qd/mask/layers/vilt/utils/write_conceptual_caption.py (in=2037) (out=761) (deflated 63%)
adding: src/qd/mask/layers/vilt/utils/write_nlvr2.py (in=2818) (out=851) (deflated 70%)
adding: src/qd/mask/layers/vilt/utils/write_vg.py (in=1928) (out=754) (deflated 61%)
adding: src/qd/mask/layers/misc.py (in=6021) (out=1660) (deflated 72%)
adding: src/qd/mask/layers/nms.py (in=618) (out=317) (deflated 49%)
adding: src/qd/mask/layers/roi_align.py (in=2142) (out=643) (deflated 70%)
adding: src/qd/mask/layers/iou_loss.py (in=1961) (out=649) (deflated 67%)
adding: src/qd/mask/layers/sigmoid_focal_loss.py (in=2374) (out=776) (deflated 67%)
adding: src/qd/mask/layers/roi_pool.py (in=1887) (out=609) (deflated 68%)
adding: src/qd/mask/layers/smooth_l1_loss.py (in=481) (out=293) (deflated 39%)
adding: src/qd/mask/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/__init__.py (in=0) (out=0) (stored 0%)
adding: src/qd/mask/data/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/data/samplers/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/data/samplers/iteration_based_batch_sampler.py (in=1164) (out=456) (deflated 61%)
adding: src/qd/mask/data/samplers/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/data/samplers/__init__.py (in=328) (out=170) (deflated 48%)
adding: src/qd/mask/data/samplers/grouped_batch_sampler.py (in=4845) (out=1645) (deflated 66%)
adding: src/qd/mask/data/samplers/distributed.py (in=3754) (out=1327) (deflated 65%)
adding: src/qd/mask/data/transforms/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/data/transforms/build.py (in=5207) (out=1163) (deflated 78%)
adding: src/qd/mask/data/transforms/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/data/transforms/__init__.py (in=284) (out=149) (deflated 48%)
adding: src/qd/mask/data/transforms/transforms.py (in=3085) (out=888) (deflated 71%)
adding: src/qd/mask/data/build.py (in=7324) (out=2393) (deflated 67%)
adding: src/qd/mask/data/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/data/__init__.py (in=108) (out=100) (deflated 7%)
adding: src/qd/mask/data/collate_batch.py (in=1080) (out=431) (deflated 60%)
adding: src/qd/mask/data/datasets/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/data/datasets/concat_dataset.py (in=766) (out=323) (deflated 58%)
adding: src/qd/mask/data/datasets/list_dataset.py (in=936) (out=444) (deflated 53%)
adding: src/qd/mask/data/datasets/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/data/datasets/__init__.py (in=458) (out=213) (deflated 53%)
adding: src/qd/mask/data/datasets/masktsvdataset.py (in=13707) (out=2874) (deflated 79%)
adding: src/qd/mask/data/datasets/caption_tsv.py (in=21070) (out=3933) (deflated 81%)
adding: src/qd/mask/data/datasets/caption_tensorizer.py (in=65072) (out=6209) (deflated 90%)
adding: src/qd/mask/data/datasets/coco.py (in=3616) (out=1265) (deflated 65%)
adding: src/qd/mask/data/datasets/evaluation/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/data/datasets/evaluation/voc/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/data/datasets/evaluation/voc/__init__.py (in=505) (out=240) (deflated 52%)
adding: src/qd/mask/data/datasets/evaluation/voc/voc_eval.py (in=8085) (out=2452) (deflated 70%)
adding: src/qd/mask/data/datasets/evaluation/__init__.py (in=994) (out=429) (deflated 57%)
adding: src/qd/mask/data/datasets/evaluation/coco/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/data/datasets/evaluation/coco/__init__.py (in=494) (out=171) (deflated 65%)
adding: src/qd/mask/data/datasets/evaluation/coco/coco_eval.py (in=13715) (out=3680) (deflated 73%)
adding: src/qd/mask/data/datasets/utils/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/data/datasets/utils/utils_glue.py (in=37206) (out=5527) (deflated 85%)
adding: src/qd/mask/data/datasets/utils/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/data/datasets/utils/config_args.py (in=3111) (out=1010) (deflated 68%)
adding: src/qd/mask/data/datasets/utils/image_ops.py (in=327) (out=192) (deflated 41%)
adding: src/qd/mask/data/datasets/utils/box_label_loader.py (in=4775) (out=1358) (deflated 72%)
adding: src/qd/mask/data/datasets/utils/load_files.py (in=2290) (out=653) (deflated 71%)
adding: src/qd/mask/data/datasets/voc.py (in=4114) (out=1407) (deflated 66%)
adding: src/qd/mask/data/README.md (in=2763) (out=1014) (deflated 63%)
adding: src/qd/mask/solver/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/solver/lr_scheduler.py (in=3423) (out=857) (deflated 75%)
adding: src/qd/mask/solver/build.py (in=3373) (out=1088) (deflated 68%)
adding: src/qd/mask/solver/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/solver/__init__.py (in=476) (out=210) (deflated 56%)
adding: src/qd/mask/solver/LARC.py (in=3976) (out=1277) (deflated 68%)
adding: src/qd/mask/solver/optimization.py (in=9530) (out=2563) (deflated 73%)
adding: src/qd/mask/config/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/config/defaults.py (in=21957) (out=6419) (deflated 71%)
adding: src/qd/mask/config/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/config/paths_catalog.py (in=8606) (out=2026) (deflated 76%)
adding: src/qd/mask/config/__init__.py (in=139) (out=105) (deflated 24%)
adding: src/qd/mask/utils/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/utils/transforms/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/utils/transforms/build.py (in=5207) (out=1163) (deflated 78%)
adding: src/qd/mask/utils/transforms/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/utils/transforms/__init__.py (in=284) (out=149) (deflated 48%)
adding: src/qd/mask/utils/transforms/transforms.py (in=3085) (out=888) (deflated 71%)
adding: src/qd/mask/utils/model_serialization.py (in=4024) (out=1492) (deflated 63%)
adding: src/qd/mask/utils/timer.py (in=1127) (out=414) (deflated 63%)
adding: src/qd/mask/utils/checkpoint.py (in=5536) (out=1552) (deflated 72%)
adding: src/qd/mask/utils/miscellaneous.py (in=228) (out=160) (deflated 30%)
adding: src/qd/mask/utils/build.py (in=7324) (out=2393) (deflated 67%)
adding: src/qd/mask/utils/metric_logger.py (in=2254) (out=787) (deflated 65%)
adding: src/qd/mask/utils/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/utils/__init__.py (in=0) (out=0) (stored 0%)
adding: src/qd/mask/utils/collate_batch.py (in=1080) (out=431) (deflated 60%)
adding: src/qd/mask/utils/imports.py (in=843) (out=382) (deflated 55%)
adding: src/qd/mask/utils/logger.py (in=783) (out=354) (deflated 55%)
adding: src/qd/mask/utils/cv2_util.py (in=640) (out=289) (deflated 55%)
adding: src/qd/mask/utils/collect_env.py (in=338) (out=203) (deflated 40%)
adding: src/qd/mask/utils/c2_model_loading.py (in=8514) (out=2112) (deflated 75%)
adding: src/qd/mask/utils/model_zoo.py (in=3031) (out=1292) (deflated 57%)
adding: src/qd/mask/utils/comm.py (in=3804) (out=1302) (deflated 66%)
adding: src/qd/mask/utils/README.md (in=175) (out=119) (deflated 32%)
adding: src/qd/mask/utils/registry.py (in=1385) (out=537) (deflated 61%)
adding: src/qd/mask/utils/env.py (in=1249) (out=522) (deflated 58%)
adding: src/qd/mask/modeling/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/poolers.py (in=4821) (out=1707) (deflated 65%)
adding: src/qd/mask/modeling/matcher.py (in=5268) (out=1668) (deflated 68%)
adding: src/qd/mask/modeling/utils.py (in=400) (out=247) (deflated 38%)
adding: src/qd/mask/modeling/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/detector/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/detector/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/detector/__init__.py (in=117) (out=104) (deflated 11%)
adding: src/qd/mask/modeling/detector/generalized_rcnn.py (in=2766) (out=1036) (deflated 63%)
adding: src/qd/mask/modeling/detector/detectors.py (in=324) (out=207) (deflated 36%)
adding: src/qd/mask/modeling/box_coder.py (in=3367) (out=1011) (deflated 70%)
adding: src/qd/mask/modeling/make_layers.py (in=3768) (out=1195) (deflated 68%)
adding: src/qd/mask/modeling/__init__.py (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/backbone/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/backbone/mobilenet.py (in=4610) (out=1304) (deflated 72%)
adding: src/qd/mask/modeling/backbone/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/backbone/resnet.py (in=15687) (out=3361) (deflated 79%)
adding: src/qd/mask/modeling/backbone/fpn.py (in=3906) (out=1261) (deflated 68%)
adding: src/qd/mask/modeling/backbone/__init__.py (in=129) (out=105) (deflated 19%)
adding: src/qd/mask/modeling/backbone/fbnet.py (in=7824) (out=2129) (deflated 73%)
adding: src/qd/mask/modeling/backbone/fbnet_builder.py (in=24950) (out=4940) (deflated 80%)
adding: src/qd/mask/modeling/backbone/fbnet_modeldef.py (in=5985) (out=857) (deflated 86%)
adding: src/qd/mask/modeling/backbone/backbone.py (in=4442) (out=931) (deflated 79%)
adding: src/qd/mask/modeling/utils_caption_evaluate.py (in=14409) (out=4505) (deflated 69%)
adding: src/qd/mask/modeling/rpn/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/rpn/learnable_anchor_generator.py (in=6232) (out=2144) (deflated 66%)
adding: src/qd/mask/modeling/rpn/rpn.py (in=7958) (out=2104) (deflated 74%)
adding: src/qd/mask/modeling/rpn/utils.py (in=1679) (out=634) (deflated 62%)
adding: src/qd/mask/modeling/rpn/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/rpn/__init__.py (in=101) (out=96) (deflated 5%)
adding: src/qd/mask/modeling/rpn/loss.py (in=10613) (out=2871) (deflated 73%)
adding: src/qd/mask/modeling/rpn/anchor_generator.py (in=10241) (out=3158) (deflated 69%)
adding: src/qd/mask/modeling/rpn/retinanet/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/rpn/retinanet/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/rpn/retinanet/__init__.py (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/rpn/retinanet/loss.py (in=3435) (out=1046) (deflated 70%)
adding: src/qd/mask/modeling/rpn/retinanet/inference.py (in=6881) (out=1978) (deflated 71%)
adding: src/qd/mask/modeling/rpn/retinanet/retinanet.py (in=5293) (out=1554) (deflated 71%)
adding: src/qd/mask/modeling/rpn/inference.py (in=10497) (out=2689) (deflated 74%)
adding: src/qd/mask/modeling/rpn/fcos/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/rpn/fcos/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/rpn/fcos/__init__.py (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/rpn/fcos/loss.py (in=11284) (out=2989) (deflated 74%)
adding: src/qd/mask/modeling/rpn/fcos/fcos.py (in=7492) (out=1982) (deflated 74%)
adding: src/qd/mask/modeling/rpn/fcos/inference.py (in=6826) (out=1886) (deflated 72%)
adding: src/qd/mask/modeling/balanced_positive_negative_sampler.py (in=4221) (out=1174) (deflated 72%)
adding: src/qd/mask/modeling/roi_heads/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/roi_heads/mask_head/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/roi_heads/mask_head/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/roi_heads/mask_head/__init__.py (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/roi_heads/mask_head/loss.py (in=5339) (out=1749) (deflated 67%)
adding: src/qd/mask/modeling/roi_heads/mask_head/mask_head.py (in=3133) (out=1076) (deflated 66%)
adding: src/qd/mask/modeling/roi_heads/mask_head/roi_mask_feature_extractors.py (in=2481) (out=898) (deflated 64%)
adding: src/qd/mask/modeling/roi_heads/mask_head/roi_mask_predictors.py (in=2208) (out=673) (deflated 70%)
adding: src/qd/mask/modeling/roi_heads/mask_head/inference.py (in=6549) (out=2359) (deflated 64%)
adding: src/qd/mask/modeling/roi_heads/attribute_head/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/roi_heads/attribute_head/roi_attribute_feature_extractors.py (in=734) (out=312) (deflated 57%)
adding: src/qd/mask/modeling/roi_heads/attribute_head/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/roi_heads/attribute_head/__init__.py (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/roi_heads/attribute_head/loss.py (in=1987) (out=781) (deflated 61%)
adding: src/qd/mask/modeling/roi_heads/attribute_head/inference.py (in=4210) (out=1517) (deflated 64%)
adding: src/qd/mask/modeling/roi_heads/attribute_head/attribute_head.py (in=4510) (out=1440) (deflated 68%)
adding: src/qd/mask/modeling/roi_heads/attribute_head/roi_attribute_predictors.py (in=3279) (out=762) (deflated 77%)
adding: src/qd/mask/modeling/roi_heads/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/roi_heads/__init__.py (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/roi_heads/keypoint_head/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/roi_heads/keypoint_head/roi_keypoint_feature_extractors.py (in=1871) (out=714) (deflated 62%)
adding: src/qd/mask/modeling/roi_heads/keypoint_head/keypoint_head.py (in=2057) (out=658) (deflated 68%)
adding: src/qd/mask/modeling/roi_heads/keypoint_head/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/roi_heads/keypoint_head/__init__.py (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/roi_heads/keypoint_head/loss.py (in=7055) (out=2143) (deflated 70%)
adding: src/qd/mask/modeling/roi_heads/keypoint_head/inference.py (in=4454) (out=1649) (deflated 63%)
adding: src/qd/mask/modeling/roi_heads/keypoint_head/roi_keypoint_predictors.py (in=1259) (out=526) (deflated 58%)
adding: src/qd/mask/modeling/roi_heads/box_head/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/roi_heads/box_head/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/roi_heads/box_head/__init__.py (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/roi_heads/box_head/loss.py (in=14045) (out=3361) (deflated 76%)
adding: src/qd/mask/modeling/roi_heads/box_head/box_head.py (in=3596) (out=1121) (deflated 69%)
adding: src/qd/mask/modeling/roi_heads/box_head/inference.py (in=20277) (out=4499) (deflated 78%)
adding: src/qd/mask/modeling/roi_heads/box_head/roi_box_predictors.py (in=3860) (out=1031) (deflated 73%)
adding: src/qd/mask/modeling/roi_heads/box_head/roi_box_feature_extractors.py (in=5830) (out=1382) (deflated 76%)
adding: src/qd/mask/modeling/roi_heads/roi_heads.py (in=4280) (out=1069) (deflated 75%)
adding: src/qd/mask/modeling/captioning/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/captioning/utils_data.py (in=3109) (out=915) (deflated 71%)
adding: src/qd/mask/modeling/captioning/utils_cbs.py (in=40304) (out=10428) (deflated 74%)
adding: src/qd/mask/modeling/captioning/utils.py (in=1670) (out=540) (deflated 68%)
adding: src/qd/mask/modeling/captioning/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/captioning/__init__.py (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/captioning/cider/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/captioning/cider/params.json (in=151) (out=100) (deflated 34%)
adding: src/qd/mask/modeling/captioning/cider/pyciderevalcap/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/captioning/cider/pyciderevalcap/eval.py (in=1426) (out=467) (deflated 67%)
adding: src/qd/mask/modeling/captioning/cider/pyciderevalcap/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/captioning/cider/pyciderevalcap/__init__.py (in=21) (out=21) (stored 0%)
adding: src/qd/mask/modeling/captioning/cider/pyciderevalcap/cider/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/captioning/cider/pyciderevalcap/cider/__init__.py (in=21) (out=21) (stored 0%)
adding: src/qd/mask/modeling/captioning/cider/pyciderevalcap/cider/cider_scorer.py (in=8234) (out=2585) (deflated 69%)
adding: src/qd/mask/modeling/captioning/cider/pyciderevalcap/cider/cider.py (in=1890) (out=822) (deflated 57%)
adding: src/qd/mask/modeling/captioning/cider/pyciderevalcap/tokenizer/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/captioning/cider/pyciderevalcap/tokenizer/tmpql9uU7 (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/captioning/cider/pyciderevalcap/tokenizer/ptbtokenizer.py (in=3889) (out=1240) (deflated 68%)
adding: src/qd/mask/modeling/captioning/cider/pyciderevalcap/tokenizer/stanford-corenlp-3.4.1.jar (in=5921410) (out=5400206) (deflated 9%)
adding: src/qd/mask/modeling/captioning/cider/pyciderevalcap/tokenizer/tmpzNW4I2 (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/captioning/cider/pyciderevalcap/tokenizer/tmpBF49XX (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/captioning/cider/pyciderevalcap/tokenizer/__init__.py (in=21) (out=21) (stored 0%)
adding: src/qd/mask/modeling/captioning/cider/pyciderevalcap/tokenizer/tmpxAmV_C (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/captioning/cider/pyciderevalcap/tokenizer/tmpuCp_T0 (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/captioning/cider/pyciderevalcap/ciderD/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/captioning/cider/pyciderevalcap/ciderD/ciderD_scorer.py (in=8860) (out=2746) (deflated 69%)
adding: src/qd/mask/modeling/captioning/cider/pyciderevalcap/ciderD/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/captioning/cider/pyciderevalcap/ciderD/__init__.py (in=21) (out=21) (stored 0%)
adding: src/qd/mask/modeling/captioning/cider/pyciderevalcap/ciderD/ciderD.py (in=1968) (out=868) (deflated 56%)
adding: src/qd/mask/modeling/captioning/cider/pydataformat/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/captioning/cider/pydataformat/__init__.py (in=20) (out=20) (stored 0%)
adding: src/qd/mask/modeling/captioning/cider/pydataformat/loadData.py (in=883) (out=377) (deflated 57%)
adding: src/qd/mask/modeling/captioning/cider/pydataformat/jsonify_refs.py (in=1159) (out=480) (deflated 59%)
adding: src/qd/mask/modeling/captioning/cider/cidereval.ipynb (in=3034) (out=896) (deflated 70%)
adding: src/qd/mask/modeling/captioning/cider/license.txt (in=1561) (out=850) (deflated 46%)
adding: src/qd/mask/modeling/captioning/cider/README.md (in=2738) (out=1255) (deflated 54%)
adding: src/qd/mask/modeling/captioning/cider/cidereval.py (in=1356) (out=585) (deflated 57%)
adding: src/qd/mask/modeling/captioning/captioning_e2e.py (in=9777) (out=2724) (deflated 72%)
adding: src/qd/mask/modeling/captioning/utils_caption_evaluate.py (in=14512) (out=4578) (deflated 68%)
adding: src/qd/mask/modeling/captioning/scan_utils.py (in=18086) (out=4385) (deflated 76%)
adding: src/qd/mask/modeling/captioning/utils_solver.py (in=1241) (out=428) (deflated 66%)
adding: src/qd/mask/modeling/captioning/scan.py (in=13928) (out=3532) (deflated 75%)
adding: src/qd/mask/modeling/registry.py (in=476) (out=201) (deflated 58%)
adding: src/qd/db.py (in=23452) (out=5174) (deflated 78%)
adding: src/qd/layers/ (in=0) (out=0) (stored 0%)
adding: src/qd/layers/kl_entropy.py (in=1598) (out=608) (deflated 62%)
adding: src/qd/layers/reshape_batch_norm.py (in=2037) (out=452) (deflated 78%)
adding: src/qd/layers/flops_count.py (in=2350) (out=696) (deflated 70%)
adding: src/qd/layers/kl_div_logit_loss.py (in=585) (out=298) (deflated 49%)
adding: src/qd/layers/resnet_vl.py (in=16876) (out=3354) (deflated 80%)
adding: src/qd/layers/shufflenet.py (in=3044) (out=733) (deflated 76%)
adding: src/qd/layers/efficient_det2.py (in=28346) (out=6864) (deflated 76%)
adding: src/qd/layers/mobilenetv3.py (in=9014) (out=2217) (deflated 75%)
adding: src/qd/layers/non_local_net/ (in=0) (out=0) (stored 0%)
adding: src/qd/layers/non_local_net/readme.txt (in=46) (out=46) (stored 0%)
adding: src/qd/layers/non_local_net/non_local_gaussian.py (in=4915) (out=1040) (deflated 79%)
adding: src/qd/layers/non_local_net/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/layers/non_local_net/non_local_concatenation.py (in=5512) (out=1154) (deflated 79%)
adding: src/qd/layers/non_local_net/non_local_dot_product.py (in=5087) (out=1020) (deflated 80%)
adding: src/qd/layers/non_local_net/non_local_embedded_gaussian.py (in=4597) (out=910) (deflated 80%)
adding: src/qd/layers/softmaxtree.py (in=682) (out=273) (deflated 60%)
adding: src/qd/layers/image_text_align.py (in=12435) (out=2801) (deflated 77%)
adding: src/qd/layers/forward_pass_time_checker.py (in=2282) (out=745) (deflated 67%)
adding: src/qd/layers/yolov5.py (in=16886) (out=5564) (deflated 67%)
adding: src/qd/layers/boxlist_nms.py (in=6241) (out=1142) (deflated 82%)
adding: src/qd/layers/forward_pass_feature_cache.py (in=3209) (out=943) (deflated 71%)
adding: src/qd/layers/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/layers/resnet.py (in=14799) (out=2870) (deflated 81%)
adding: src/qd/layers/mitorch_models/ (in=0) (out=0) (stored 0%)
adding: src/qd/layers/mitorch_models/efficientnet.py (in=3541) (out=995) (deflated 72%)
adding: src/qd/layers/mitorch_models/shufflenet.py (in=4311) (out=1059) (deflated 75%)
adding: src/qd/layers/mitorch_models/resnext.py (in=3382) (out=919) (deflated 73%)
adding: src/qd/layers/mitorch_models/mobilenetv3.py (in=4249) (out=727) (deflated 83%)
adding: src/qd/layers/mitorch_models/mobilenetv2.py (in=3323) (out=758) (deflated 77%)
adding: src/qd/layers/mitorch_models/vgg.py (in=9394) (out=719) (deflated 92%)
adding: src/qd/layers/mitorch_models/ssd_lite.py (in=1661) (out=548) (deflated 67%)
adding: src/qd/layers/mitorch_models/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/layers/mitorch_models/__init__.py (in=923) (out=330) (deflated 64%)
adding: src/qd/layers/mitorch_models/factory.py (in=11197) (out=1199) (deflated 89%)
adding: src/qd/layers/mitorch_models/feature_pyramid_network.py (in=7059) (out=1041) (deflated 85%)
adding: src/qd/layers/mitorch_models/model.py (in=2380) (out=863) (deflated 64%)
adding: src/qd/layers/mitorch_models/modules/ (in=0) (out=0) (stored 0%)
adding: src/qd/layers/mitorch_models/modules/base.py (in=144) (out=99) (deflated 31%)
adding: src/qd/layers/mitorch_models/modules/convolution.py (in=2309) (out=606) (deflated 74%)
adding: src/qd/layers/mitorch_models/modules/prior_box.py (in=2162) (out=714) (deflated 67%)
adding: src/qd/layers/mitorch_models/modules/mbconv.py (in=1312) (out=454) (deflated 65%)
adding: src/qd/layers/mitorch_models/modules/shuffle.py (in=497) (out=232) (deflated 53%)
adding: src/qd/layers/mitorch_models/modules/se_block.py (in=969) (out=400) (deflated 59%)
adding: src/qd/layers/mitorch_models/modules/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/layers/mitorch_models/modules/retina_prior_box.py (in=913) (out=363) (deflated 60%)
adding: src/qd/layers/mitorch_models/modules/__init__.py (in=497) (out=219) (deflated 56%)
adding: src/qd/layers/mitorch_models/modules/focal_loss.py (in=2816) (out=809) (deflated 71%)
adding: src/qd/layers/mitorch_models/modules/retina_predictor.py (in=460) (out=205) (deflated 55%)
adding: src/qd/layers/mitorch_models/modules/ssd_predictor.py (in=3127) (out=950) (deflated 70%)
adding: src/qd/layers/mitorch_models/modules/activation.py (in=387) (out=150) (deflated 61%)
adding: src/qd/layers/mitorch_models/modules/ssd_loss.py (in=9847) (out=2659) (deflated 73%)
adding: src/qd/layers/mitorch_models/modules/addition.py (in=237) (out=139) (deflated 41%)
adding: src/qd/layers/mitorch_models/modules/linear.py (in=346) (out=174) (deflated 50%)
adding: src/qd/layers/mitorch_models/modules/non_max_suppression.py (in=3545) (out=1048) (deflated 70%)
adding: src/qd/layers/mitorch_models/classifier.py (in=354) (out=173) (deflated 51%)
adding: src/qd/layers/mitorch_models/seresnext.py (in=2054) (out=529) (deflated 74%)
adding: src/qd/layers/mitorch_models/bidirectional_feature_pyramid_network.py (in=5167) (out=1227) (deflated 76%)
adding: src/qd/layers/mitorch_models/ssdlite_extra_layers.py (in=3847) (out=782) (deflated 80%)
adding: src/qd/layers/mitorch_models/squeezenet.py (in=1933) (out=550) (deflated 72%)
adding: src/qd/layers/mitorch_models/shufflenetv2.py (in=4143) (out=986) (deflated 76%)
adding: src/qd/layers/mitorch_models/retinanet.py (in=2252) (out=675) (deflated 70%)
adding: src/qd/layers/__init__.py (in=338) (out=160) (deflated 53%)
adding: src/qd/layers/ssfpn.py (in=5639) (out=1751) (deflated 69%)
adding: src/qd/layers/batch_norm.py (in=8147) (out=2098) (deflated 74%)
adding: src/qd/layers/loss.py (in=22102) (out=4109) (deflated 81%)
adding: src/qd/layers/group_batch_norm.py (in=1550) (out=485) (deflated 69%)
adding: src/qd/layers/adapt_avg_pool2d.py (in=620) (out=291) (deflated 53%)
adding: src/qd/layers/forward_pass_memory_checker.py (in=2338) (out=774) (deflated 67%)
adding: src/qd/layers/forward_image_model.py (in=225) (out=138) (deflated 39%)
adding: src/qd/layers/create_layer.py (in=142) (out=107) (deflated 25%)
adding: src/qd/layers/efficient_det.py (in=93199) (out=18170) (deflated 81%)
adding: src/qd/layers/smooth_l1_loss.py (in=1079) (out=418) (deflated 61%)
adding: src/qd/layers/standarized_conv.py (in=1539) (out=472) (deflated 69%)
adding: src/qd/layers/tensor_queue.py (in=1165) (out=471) (deflated 60%)
adding: src/qd/layers/feature_extract.py (in=2216) (out=641) (deflated 71%)
adding: src/qd/layers/merge_batch_norm.py (in=4277) (out=1302) (deflated 70%)
adding: src/qd/layers/ntxent_loss.py (in=15841) (out=3249) (deflated 79%)
adding: src/qd/layers/precise_bn.py (in=4286) (out=722) (deflated 83%)
adding: src/qd/process_tsv.py (in=315680) (out=60053) (deflated 81%)
adding: src/qd/compile/ (in=0) (out=0) (stored 0%)
adding: src/qd/compile/gcc_ignore.py (in=2131) (out=956) (deflated 55%)
adding: src/qd/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/prep_dataset/ (in=0) (out=0) (stored 0%)
adding: src/qd/prep_dataset/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/prep_dataset/vlp_version.py (in=1791) (out=714) (deflated 60%)
adding: src/qd/prep_dataset/open_image_v5c.py (in=14437) (out=3417) (deflated 76%)
adding: src/qd/prep_dataset/wider_face.py (in=1996) (out=765) (deflated 62%)
adding: src/qd/prep_dataset/prep_coco_2017.py (in=6367) (out=1565) (deflated 75%)
adding: src/qd/prep_dataset/clean_label.py (in=7181) (out=1860) (deflated 74%)
adding: src/qd/prep_dataset/vizwiz.py (in=4752) (out=991) (deflated 79%)
adding: src/qd/prep_dataset/open_image_v6_det.py (in=14745) (out=3485) (deflated 76%)
adding: src/qd/prep_dataset/build_tax_data.py (in=35401) (out=3365) (deflated 90%)
adding: src/qd/gpu_util.py (in=3780) (out=1082) (deflated 71%)
adding: src/qd/pipeline_runner.py (in=7501) (out=1882) (deflated 75%)
adding: src/qd/__init__.py (in=0) (out=0) (stored 0%)
adding: src/qd/tsv_io.py (in=50610) (out=11013) (deflated 78%)
adding: src/qd/data_layer/ (in=0) (out=0) (stored 0%)
adding: src/qd/data_layer/batch_kmeans.py (in=6645) (out=1870) (deflated 72%)
adding: src/qd/data_layer/dataset.py (in=9251) (out=1937) (deflated 79%)
adding: src/qd/data_layer/transform.py (in=75960) (out=14363) (deflated 81%)
adding: src/qd/data_layer/autoaugmentation.py (in=11211) (out=1846) (deflated 84%)
adding: src/qd/data_layer/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/data_layer/builder.py (in=14785) (out=3166) (deflated 79%)
adding: src/qd/data_layer/samplers.py (in=7262) (out=2073) (deflated 71%)
adding: src/qd/data_layer/rand_augmentation.py (in=29487) (out=6586) (deflated 78%)
adding: src/qd/data_layer/loader.py (in=871) (out=285) (deflated 67%)
adding: src/qd/remote_run.py (in=6483) (out=1689) (deflated 74%)
adding: src/qd/deteval.py (in=26011) (out=6577) (deflated 75%)
adding: src/qd/logger.py (in=3511) (out=976) (deflated 72%)
adding: src/qd/pytablemd.py (in=3412) (out=1223) (deflated 64%)
adding: src/qd/examples/ (in=0) (out=0) (stored 0%)
adding: src/qd/examples/efficient_det0.py (in=2143) (out=900) (deflated 58%)
adding: src/qd/opt/ (in=0) (out=0) (stored 0%)
adding: src/qd/opt/checkpoint.py (in=14958) (out=4595) (deflated 69%)
adding: src/qd/opt/sampler.py (in=3528) (out=967) (deflated 73%)
adding: src/qd/opt/sgd.py (in=832) (out=337) (deflated 59%)
adding: src/qd/opt/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/opt/__init__.py (in=71) (out=61) (deflated 14%)
adding: src/qd/opt/ema_optimizer.py (in=2790) (out=882) (deflated 68%)
adding: src/qd/opt/WarmupCosineAnnealingLR.py (in=2418) (out=653) (deflated 73%)
adding: src/qd/opt/trainer.py (in=29302) (out=5036) (deflated 83%)
adding: src/qd/gpucluster/ (in=0) (out=0) (stored 0%)
adding: src/qd/gpucluster/aml_client.py (in=52390) (out=11691) (deflated 78%)
adding: src/qd/gpucluster/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/gpucluster/__init__.py (in=123) (out=74) (deflated 40%)
adding: src/qd/gpucluster/philly_client.py (in=59677) (out=14947) (deflated 75%)
adding: src/qd/gpucluster/aml_server.py (in=10236) (out=3472) (deflated 66%)
adding: src/qd/gpucluster/philly_server.py (in=8651) (out=2962) (deflated 66%)
adding: src/qd/gpucluster/README.md (in=20017) (out=6882) (deflated 66%)
adding: src/qd/gpucluster/aux_data (in=37) (out=37) (stored 0%)
adding: src/qd/qd_caffe.py (in=23661) (out=5459) (deflated 77%)
adding: src/qd/cloud_storage.py (in=25121) (out=5551) (deflated 78%)
adding: src/qd/pipelines/ (in=0) (out=0) (stored 0%)
adding: src/qd/pipelines/test (in=40326) (out=8930) (deflated 78%)
adding: src/qd/pipelines/caption_uni_pipeline.py (in=28442) (out=6659) (deflated 77%)
adding: src/qd/pipelines/multi_scale/ (in=0) (out=0) (stored 0%)
adding: src/qd/pipelines/multi_scale/multi_scale_caption_uni_pipeline_multi_tower.py (in=27974) (out=6487) (deflated 77%)
adding: src/qd/pipelines/multi_scale/multi_scale_caption_uni_pipeline_two_tower.py (in=27387) (out=6467) (deflated 76%)
adding: src/qd/pipelines/multi_scale/multi_scale_vlp_uni_pipeline_jf.py (in=23762) (out=4858) (deflated 80%)
adding: src/qd/pipelines/multi_scale/multi_scale_caption_uni_pipeline_token_drop.py (in=26366) (out=6310) (deflated 76%)
adding: src/qd/pipelines/multi_scale/multi_scale_caption_uni_pipeline_encdec_nocaps.py (in=38404) (out=8847) (deflated 77%)
adding: src/qd/pipelines/multi_scale/multi_scale_caption_uni_pipeline_recurrent_training.py (in=33081) (out=7831) (deflated 76%)
adding: src/qd/pipelines/multi_scale/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/pipelines/multi_scale/others/ (in=0) (out=0) (stored 0%)
adding: src/qd/pipelines/multi_scale/others/multi_scale_vqa_uni_pipeline.py (in=24525) (out=6219) (deflated 75%)
adding: src/qd/pipelines/multi_scale/multi_scale_caption_uni_pipeline_mutual_tower.py (in=27100) (out=6449) (deflated 76%)
adding: src/qd/pipelines/multi_scale/multi_scale_caption_uni_pipeline_attention_select.py (in=40770) (out=8410) (deflated 79%)
adding: src/qd/pipelines/multi_scale/multi_scale_caption_uni_pipeline_clip.py (in=26088) (out=6190) (deflated 76%)
adding: src/qd/pipelines/multi_scale/multi_scale_caption_uni_pipeline.py (in=42101) (out=9200) (deflated 78%)
adding: src/qd/pipelines/multi_scale/multi_scale_caption_uni_pipeline_encdec.py (in=43151) (out=9675) (deflated 78%)
adding: src/qd/pipelines/multi_scale/caption_uni_pipeline_bbox.py (in=60683) (out=11922) (deflated 80%)
adding: src/qd/pipelines/ViT_tagger_uni_pipeline_vis.py (in=42590) (out=10051) (deflated 76%)
adding: src/qd/pipelines/tagger_caption_uni_pipeline_expanding_bertemb.py (in=47335) (out=11193) (deflated 76%)
adding: src/qd/pipelines/tagger_caption_uni_pipeline.py (in=46538) (out=11217) (deflated 76%)
adding: src/qd/pipelines/ViT_tagger_uni_pipeline.py (in=31661) (out=8025) (deflated 75%)
adding: src/qd/pipelines/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/pipelines/distillation/ (in=0) (out=0) (stored 0%)
adding: src/qd/pipelines/distillation/logit_distill_caption_uni_pipeline.py (in=31204) (out=7376) (deflated 76%)
adding: src/qd/pipelines/distillation/vlp_uni_pipeline_distill_proposal.py (in=29656) (out=6867) (deflated 77%)
adding: src/qd/pipelines/distillation/logit_distill_multi_scale_caption_uni_pipeline.py (in=33730) (out=7763) (deflated 77%)
adding: src/qd/pipelines/distillation/vlp_uni_pipeline_distill.py (in=37235) (out=8160) (deflated 78%)
adding: src/qd/pipelines/distillation/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/pipelines/distillation/multi_scale_distillation_caption_uni_pipeline_encdec.py (in=28288) (out=6543) (deflated 77%)
adding: src/qd/pipelines/distillation/vlp_uni_pipeline_distill_encoder_decoder.py (in=33806) (out=7639) (deflated 77%)
adding: src/qd/pipelines/distillation/vlp_uni_pipeline_distill_gumbel.py (in=36730) (out=8459) (deflated 77%)
adding: src/qd/pipelines/distillation/vlp_uni_pipeline_distill_with_tags.py (in=69857) (out=13401) (deflated 81%)
adding: src/qd/pipelines/distillation/correct_distill_multi_scale_caption_uni_pipeline.py (in=34062) (out=7956) (deflated 77%)
adding: src/qd/pipelines/distillation/vqa_uni_pipeline_distill.py (in=25027) (out=6281) (deflated 75%)
adding: src/qd/pipelines/distillation/multi_scale_distillation_caption_uni_pipeline.py (in=34318) (out=7997) (deflated 77%)
adding: src/qd/pipelines/ViT_all_token_tagger_uni_pipeline.py (in=27231) (out=7251) (deflated 73%)
adding: src/qd/pipelines/tagger_caption_uni_pipeline_expanding_bertemb_vis.py (in=51299) (out=11999) (deflated 77%)
adding: src/qd/pipelines/tagger_caption_uni_pipeline_expanding_bertemb_distill.py (in=56629) (out=12861) (deflated 77%)
adding: src/qd/pipelines/others/ (in=0) (out=0) (stored 0%)
adding: src/qd/pipelines/others/sim_clr.py (in=5938) (out=1923) (deflated 68%)
adding: src/qd/pipelines/others/kl_entropy_pipeline.py (in=932) (out=394) (deflated 58%)
adding: src/qd/pipelines/others/tap_uni_pipeline.py (in=4162) (out=1220) (deflated 71%)
adding: src/qd/pipelines/others/faster_rcnn_distill.py (in=8417) (out=2139) (deflated 75%)
adding: src/qd/pipelines/others/ocpretrain.py (in=3590) (out=1315) (deflated 63%)
adding: src/qd/pipelines/others/moco_distill.py (in=32695) (out=7145) (deflated 78%)
adding: src/qd/pipelines/others/checkpoint_zero_pipeline.py (in=25473) (out=6271) (deflated 75%)
adding: src/qd/pipelines/others/simple_vl.py (in=3368) (out=1136) (deflated 66%)
adding: src/qd/pipelines/others/efficient_det_distill.py (in=6117) (out=1576) (deflated 74%)
adding: src/qd/pipelines/others/classification_for_maskrcnn.py (in=2602) (out=972) (deflated 63%)
adding: src/qd/pipelines/others/m4c_tap.py (in=13809) (out=3456) (deflated 75%)
adding: src/qd/pipelines/others/mmask_pretrain.py (in=20094) (out=5207) (deflated 74%)
adding: src/qd/pipelines/others/image_text_retrieval.py (in=32481) (out=7683) (deflated 76%)
adding: src/qd/pipelines/others/auto_param.py (in=38878) (out=7254) (deflated 81%)
adding: src/qd/pipelines/others/multi_scale_vlp_uni_pipeline.py (in=16066) (out=4140) (deflated 74%)
adding: src/qd/pipelines/others/yolo_by_mask.py (in=32567) (out=7973) (deflated 76%)
adding: src/qd/pipelines/others/vqa.py (in=29396) (out=7593) (deflated 74%)
adding: src/qd/pipelines/others/tagger_caption_uni_pipeline_expanding_bertemb_gradient.py (in=45852) (out=10864) (deflated 76%)
adding: src/qd/pipelines/others/yolov5.py (in=73599) (out=22919) (deflated 69%)
adding: src/qd/pipelines/others/soft_balanced.py (in=40025) (out=7721) (deflated 81%)
adding: src/qd/pipelines/others/s4_pipeline.py (in=9581) (out=2696) (deflated 72%)
adding: src/qd/pipelines/others/qd_mmdetection.py (in=10131) (out=2816) (deflated 72%)
adding: src/qd/pipelines/others/late_fusion_caption_uni_pipeline.py (in=6150) (out=1754) (deflated 71%)
adding: src/qd/pipelines/others/cluster_fit.py (in=1531) (out=579) (deflated 62%)
adding: src/qd/pipelines/others/fb_swav.py (in=32879) (out=8783) (deflated 73%)
adding: src/qd/pipelines/others/triplet_contrastive_pipeline.py (in=10795) (out=3472) (deflated 68%)
adding: src/qd/pipelines/others/clip_uni_pipeline.py (in=22206) (out=5015) (deflated 77%)
adding: src/qd/pipelines/others/e2e_caption.py (in=44482) (out=9419) (deflated 79%)
adding: src/qd/pipelines/others/fcos.py (in=22424) (out=5299) (deflated 76%)
adding: src/qd/pipelines/others/mm_detect.py (in=8232) (out=2473) (deflated 70%)
adding: src/qd/pipelines/others/heatmap_score_box.py (in=8226) (out=2295) (deflated 72%)
adding: src/qd/pipelines/others/fb_moco.py (in=52221) (out=10031) (deflated 81%)
adding: src/qd/pipelines/others/classification_by_maskrcnn.py (in=21947) (out=5030) (deflated 77%)
adding: src/qd/pipelines/others/detectron2.py (in=17876) (out=5154) (deflated 71%)
adding: src/qd/pipelines/others/pipeline_base.py (in=167) (out=106) (deflated 37%)
adding: src/qd/pipelines/others/slow_contrast.py (in=19457) (out=4703) (deflated 76%)
adding: src/qd/pipelines/others/fast_human_det.py (in=10675) (out=2903) (deflated 73%)
adding: src/qd/pipelines/others/usl_cmc.py (in=14134) (out=3350) (deflated 76%)
adding: src/qd/pipelines/others/clip.py (in=32519) (out=7938) (deflated 76%)
adding: src/qd/pipelines/others/cls_feature_extract_uni_pipeline.py (in=3016) (out=1032) (deflated 66%)
adding: src/qd/pipelines/others/vqa_uni_pipeline.py (in=19771) (out=5266) (deflated 73%)
adding: src/qd/pipelines/others/knn_classifier.py (in=3450) (out=1109) (deflated 68%)
adding: src/qd/pipelines/others/extract_spatial_before_avgpool.py (in=774) (out=332) (deflated 57%)
adding: src/qd/pipelines/others/mmask.py (in=13738) (out=3557) (deflated 74%)
adding: src/qd/pipelines/others/efficient_det_pipeline.py (in=22394) (out=5094) (deflated 77%)
adding: src/qd/pipelines/others/reppoint.py (in=7625) (out=2334) (deflated 69%)
adding: src/qd/pipelines/others/caption_uni_pipeline_distill.py (in=30947) (out=7392) (deflated 76%)
adding: src/qd/pipelines/others/det_clip_uni_pipeline.py (in=530) (out=231) (deflated 56%)
adding: src/qd/pipelines/others/cls_uni_pipeline.py (in=3408) (out=1086) (deflated 68%)
adding: src/qd/pipelines/others/mmask_caption.py (in=45160) (out=9586) (deflated 79%)
adding: src/qd/pipelines/others/distill_caption_uni_pipeline.py (in=18506) (out=4059) (deflated 78%)
adding: src/qd/pipelines/others/yolov2_pt.py (in=6564) (out=1805) (deflated 73%)
adding: src/qd/pipelines/others/contrastive_vlp_uni_pipeline.py (in=10281) (out=2663) (deflated 74%)
adding: src/qd/pipelines/others/vlp_uni_pipeline.py (in=9724) (out=2366) (deflated 76%)
adding: src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py (in=53559) (out=12387) (deflated 77%)
adding: src/qd/pipelines/Kim/ (in=0) (out=0) (stored 0%)
adding: src/qd/pipelines/Kim/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/pipelines/Kim/kim_vilt_caption_uni_pipeline.py (in=29311) (out=7167) (deflated 76%)
adding: src/qd/pipelines/Kim/kim_vqa_uni_pipeline.py (in=24356) (out=6448) (deflated 74%)
adding: src/qd/pipelines/Kim/kim_vqa_logit_distill_uni_pipeline.py (in=28242) (out=7251) (deflated 74%)
adding: src/qd/pipelines/vqa_uni_pipeline.py (in=19811) (out=5272) (deflated 73%)
adding: src/qd/pipelines/uni_pipeline.py (in=81347) (out=16646) (deflated 80%)
adding: src/qd/pipelines/vlp_uni_pipeline.py (in=12680) (out=3056) (deflated 76%)
adding: src/qd/latex_writer.py (in=7828) (out=1846) (deflated 76%)
adding: src/qd/hnms.py (in=18222) (out=3055) (deflated 83%)
adding: src/qd/process_dataset.py (in=11742) (out=2804) (deflated 76%)
adding: src/qd/project/ (in=0) (out=0) (stored 0%)
adding: src/qd/project/text_aware_pre_training.py (in=66226) (out=11617) (deflated 82%)
adding: src/qd/project/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/project/semi_weak_pretrain.py (in=97677) (out=16466) (deflated 83%)
adding: src/qd/project/general_vision_language.py (in=25055) (out=6017) (deflated 76%)
adding: src/qd/garbage_collector.py (in=4639) (out=1269) (deflated 73%)
adding: src/qd/demo_detection.py (in=14115) (out=3641) (deflated 74%)
adding: src/qd/qd_common.py (in=123355) (out=30491) (deflated 75%)
adding: src/pytorch_image_models/ (in=0) (out=0) (stored 0%)
adding: src/pytorch_image_models/timm/ (in=0) (out=0) (stored 0%)
adding: src/pytorch_image_models/timm/optim/ (in=0) (out=0) (stored 0%)
adding: src/pytorch_image_models/timm/optim/radam.py (in=5924) (out=1129) (deflated 81%)
adding: src/pytorch_image_models/timm/optim/optim_factory.py (in=4764) (out=1221) (deflated 74%)
adding: src/pytorch_image_models/timm/optim/adahessian.py (in=6535) (out=2197) (deflated 66%)
adding: src/pytorch_image_models/timm/optim/nvnovograd.py (in=4795) (out=1605) (deflated 67%)
adding: src/pytorch_image_models/timm/optim/adamp.py (in=3689) (out=1261) (deflated 66%)
adding: src/pytorch_image_models/timm/optim/lookahead.py (in=3815) (out=1185) (deflated 69%)
adding: src/pytorch_image_models/timm/optim/__init__.py (in=368) (out=158) (deflated 57%)
adding: src/pytorch_image_models/timm/optim/sgdp.py (in=3231) (out=1160) (deflated 64%)
adding: src/pytorch_image_models/timm/optim/novograd.py (in=2925) (out=943) (deflated 68%)
adding: src/pytorch_image_models/timm/optim/adafactor.py (in=8126) (out=2354) (deflated 71%)
adding: src/pytorch_image_models/timm/optim/rmsprop_tf.py (in=6127) (out=2017) (deflated 67%)
adding: src/pytorch_image_models/timm/optim/adamw.py (in=4965) (out=1603) (deflated 68%)
adding: src/pytorch_image_models/timm/optim/nadam.py (in=3758) (out=1304) (deflated 65%)
adding: src/pytorch_image_models/timm/scheduler/ (in=0) (out=0) (stored 0%)
adding: src/pytorch_image_models/timm/scheduler/cosine_lr.py (in=3977) (out=1196) (deflated 70%)
adding: src/pytorch_image_models/timm/scheduler/scheduler.py (in=4750) (out=1467) (deflated 69%)
adding: src/pytorch_image_models/timm/scheduler/plateau_lr.py (in=4140) (out=1274) (deflated 69%)
adding: src/pytorch_image_models/timm/scheduler/tanh_lr.py (in=4045) (out=1157) (deflated 71%)
adding: src/pytorch_image_models/timm/scheduler/__init__.py (in=206) (out=100) (deflated 51%)
adding: src/pytorch_image_models/timm/scheduler/step_lr.py (in=1902) (out=589) (deflated 69%)
adding: src/pytorch_image_models/timm/scheduler/scheduler_factory.py (in=3268) (out=654) (deflated 80%)
adding: src/pytorch_image_models/timm/loss/ (in=0) (out=0) (stored 0%)
adding: src/pytorch_image_models/timm/loss/jsd.py (in=1595) (out=747) (deflated 53%)
adding: src/pytorch_image_models/timm/loss/cross_entropy.py (in=1082) (out=393) (deflated 64%)
adding: src/pytorch_image_models/timm/loss/__init__.py (in=191) (out=116) (deflated 39%)
adding: src/pytorch_image_models/timm/loss/asymmetric_loss.py (in=3322) (out=1001) (deflated 70%)
adding: src/pytorch_image_models/timm/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/pytorch_image_models/timm/__init__.py (in=189) (out=108) (deflated 43%)
adding: src/pytorch_image_models/timm/data/ (in=0) (out=0) (stored 0%)
adding: src/pytorch_image_models/timm/data/mixup.py (in=14711) (out=3297) (deflated 78%)
adding: src/pytorch_image_models/timm/data/config.py (in=2756) (out=782) (deflated 72%)
adding: src/pytorch_image_models/timm/data/parsers/ (in=0) (out=0) (stored 0%)
adding: src/pytorch_image_models/timm/data/parsers/parser_tfds.py (in=10443) (out=3703) (deflated 65%)
adding: src/pytorch_image_models/timm/data/parsers/constants.py (in=43) (out=38) (deflated 12%)
adding: src/pytorch_image_models/timm/data/parsers/parser_factory.py (in=1116) (out=480) (deflated 57%)
adding: src/pytorch_image_models/timm/data/parsers/parser_image_tar.py (in=2589) (out=1005) (deflated 61%)
adding: src/pytorch_image_models/timm/data/parsers/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/pytorch_image_models/timm/data/parsers/__init__.py (in=42) (out=40) (deflated 5%)
adding: src/pytorch_image_models/timm/data/parsers/class_map.py (in=571) (out=269) (deflated 53%)
adding: src/pytorch_image_models/timm/data/parsers/parser_image_folder.py (in=2508) (out=983) (deflated 61%)
adding: src/pytorch_image_models/timm/data/parsers/parser_image_in_tar.py (in=8987) (out=2855) (deflated 68%)
adding: src/pytorch_image_models/timm/data/parsers/parser.py (in=487) (out=174) (deflated 64%)
adding: src/pytorch_image_models/timm/data/dataset.py (in=4506) (out=1302) (deflated 71%)
adding: src/pytorch_image_models/timm/data/dataset_factory.py (in=1057) (out=433) (deflated 59%)
adding: src/pytorch_image_models/timm/data/constants.py (in=303) (out=153) (deflated 50%)
adding: src/pytorch_image_models/timm/data/tf_preprocessing.py (in=9120) (out=2622) (deflated 71%)
adding: src/pytorch_image_models/timm/data/distributed_sampler.py (in=1955) (out=720) (deflated 63%)
adding: src/pytorch_image_models/timm/data/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/pytorch_image_models/timm/data/__init__.py (in=553) (out=226) (deflated 59%)
adding: src/pytorch_image_models/timm/data/transforms_factory.py (in=8262) (out=2049) (deflated 75%)
adding: src/pytorch_image_models/timm/data/loader.py (in=8732) (out=2441) (deflated 72%)
adding: src/pytorch_image_models/timm/data/real_labels.py (in=1590) (out=677) (deflated 57%)
adding: src/pytorch_image_models/timm/data/random_erasing.py (in=4512) (out=1620) (deflated 64%)
adding: src/pytorch_image_models/timm/data/transforms.py (in=5328) (out=1626) (deflated 69%)
adding: src/pytorch_image_models/timm/data/auto_augment.py (in=29504) (out=6595) (deflated 78%)
adding: src/pytorch_image_models/timm/models/ (in=0) (out=0) (stored 0%)
adding: src/pytorch_image_models/timm/models/selecsls.py (in=13100) (out=3099) (deflated 76%)
adding: src/pytorch_image_models/timm/models/resnetv2.py (in=23800) (out=5365) (deflated 77%)
adding: src/pytorch_image_models/timm/models/efficientnet.py (in=71184) (out=8131) (deflated 89%)
adding: src/pytorch_image_models/timm/models/inception_resnet_v2.py (in=12318) (out=2211) (deflated 82%)
adding: src/pytorch_image_models/timm/models/rexnet.py (in=9972) (out=2841) (deflated 72%)
adding: src/pytorch_image_models/timm/models/hrnet.py (in=29301) (out=5030) (deflated 83%)
adding: src/pytorch_image_models/timm/models/cspnet.py (in=17904) (out=4226) (deflated 76%)
adding: src/pytorch_image_models/timm/models/t2t_vit/ (in=0) (out=0) (stored 0%)
adding: src/pytorch_image_models/timm/models/t2t_vit/transformer_block.py (in=3286) (out=1144) (deflated 65%)
adding: src/pytorch_image_models/timm/models/t2t_vit/token_performer.py (in=1129) (out=527) (deflated 53%)
adding: src/pytorch_image_models/timm/models/t2t_vit/t2t_vit_dense.py (in=6736) (out=2132) (deflated 68%)
adding: src/pytorch_image_models/timm/models/t2t_vit/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/pytorch_image_models/timm/models/t2t_vit/t2t_vit.py (in=12471) (out=2496) (deflated 80%) adding: src/pytorch_image_models/timm/models/t2t_vit/__init__.py (in=310) (out=195) (deflated 37%) adding: src/pytorch_image_models/timm/models/t2t_vit/t2t_vit_ghost.py (in=7737) (out=2148) (deflated 72%) adding: src/pytorch_image_models/timm/models/t2t_vit/t2t_vit_se.py (in=6305) (out=2029) (deflated 68%) adding: src/pytorch_image_models/timm/models/t2t_vit/token_transformer.py (in=2326) (out=917) (deflated 61%) adding: src/pytorch_image_models/timm/models/mobilenetv3.py (in=17607) (out=3544) (deflated 80%) adding: src/pytorch_image_models/timm/models/resnest.py (in=10194) (out=2170) (deflated 79%) adding: src/pytorch_image_models/timm/models/gluon_xception.py (in=9530) (out=2523) (deflated 74%) adding: src/pytorch_image_models/timm/models/vision_transformer.py (in=67293) (out=11252) (deflated 83%) adding: src/pytorch_image_models/timm/models/efficientnet_builder.py (in=17482) (out=4957) (deflated 72%) adding: src/pytorch_image_models/timm/models/nfnet.py (in=20612) (out=5303) (deflated 74%) adding: src/pytorch_image_models/timm/models/layers/ (in=0) (out=0) (stored 0%) adding: src/pytorch_image_models/timm/models/layers/config.py (in=3069) (out=801) (deflated 74%) adding: src/pytorch_image_models/timm/models/layers/padding.py (in=2167) (out=864) (deflated 60%) adding: src/pytorch_image_models/timm/models/layers/norm_act.py (in=3542) (out=1147) (deflated 68%) adding: src/pytorch_image_models/timm/models/layers/inplace_abn.py (in=3353) (out=1132) (deflated 66%) adding: src/pytorch_image_models/timm/models/layers/pool2d_same.py (in=2969) (out=801) (deflated 73%) adding: src/pytorch_image_models/timm/models/layers/median_pool.py (in=1737) (out=662) (deflated 62%) adding: src/pytorch_image_models/timm/models/layers/__pycache__/ (in=0) (out=0) (stored 0%) adding: src/pytorch_image_models/timm/models/layers/mixed_conv2d.py (in=1844) (out=786) (deflated 57%) adding: src/pytorch_image_models/timm/models/layers/cbam.py (in=3337) (out=944) (deflated 72%) adding: src/pytorch_image_models/timm/models/layers/activations_me.py (in=5886) (out=1439) (deflated 76%) adding: src/pytorch_image_models/timm/models/layers/__init__.py (in=1767) (out=648) (deflated 63%) adding: src/pytorch_image_models/timm/models/layers/blur_pool.py (in=2180) (out=966) (deflated 56%) adding: src/pytorch_image_models/timm/models/layers/std_conv.py (in=3920) (out=971) (deflated 75%) adding: src/pytorch_image_models/timm/models/layers/split_batchnorm.py (in=3441) (out=1216) (deflated 65%) adding: src/pytorch_image_models/timm/models/layers/test_time_pool.py (in=1851) (out=708) (deflated 62%) adding: src/pytorch_image_models/timm/models/layers/activations.py (in=4040) (out=1104) (deflated 73%) adding: src/pytorch_image_models/timm/models/layers/selective_kernel.py (in=5282) (out=1716) (deflated 68%) adding: src/pytorch_image_models/timm/models/layers/create_conv2d.py (in=1399) (out=591) (deflated 58%) adding: src/pytorch_image_models/timm/models/layers/split_attn.py (in=3013) (out=1008) (deflated 67%) adding: src/pytorch_image_models/timm/models/layers/create_attn.py (in=1418) (out=461) (deflated 67%) adding: src/pytorch_image_models/timm/models/layers/weight_init.py (in=2359) (out=1003) (deflated 57%) adding: src/pytorch_image_models/timm/models/layers/evo_norm.py (in=3328) (out=981) (deflated 71%) adding: src/pytorch_image_models/timm/models/layers/adaptive_avgmax_pool.py (in=3903) (out=1008) (deflated 74%) adding: 
src/pytorch_image_models/timm/models/layers/eca.py (in=4701) (out=1848) (deflated 61%) adding: src/pytorch_image_models/timm/models/layers/create_act.py (in=3904) (out=1084) (deflated 72%) adding: src/pytorch_image_models/timm/models/layers/helpers.py (in=738) (out=382) (deflated 48%) adding: src/pytorch_image_models/timm/models/layers/classifier.py (in=2300) (out=784) (deflated 66%) adding: src/pytorch_image_models/timm/models/layers/create_norm_act.py (in=3327) (out=1199) (deflated 64%) adding: src/pytorch_image_models/timm/models/layers/space_to_depth.py (in=1750) (out=490) (deflated 72%) adding: src/pytorch_image_models/timm/models/layers/drop.py (in=6938) (out=2062) (deflated 70%) adding: src/pytorch_image_models/timm/models/layers/activations_jit.py (in=2529) (out=877) (deflated 65%) adding: src/pytorch_image_models/timm/models/layers/conv_bn_act.py (in=1466) (out=585) (deflated 60%) adding: src/pytorch_image_models/timm/models/layers/linear.py (in=743) (out=380) (deflated 49%) adding: src/pytorch_image_models/timm/models/layers/conv2d_same.py (in=1490) (out=596) (deflated 60%) adding: src/pytorch_image_models/timm/models/layers/separable_conv.py (in=2641) (out=717) (deflated 73%) adding: src/pytorch_image_models/timm/models/layers/anti_aliasing.py (in=2293) (out=708) (deflated 69%) adding: src/pytorch_image_models/timm/models/layers/cond_conv2d.py (in=5129) (out=1597) (deflated 69%) adding: src/pytorch_image_models/timm/models/layers/se.py (in=2294) (out=761) (deflated 67%) adding: src/pytorch_image_models/timm/models/senet.py (in=17637) (out=3473) (deflated 80%) adding: src/pytorch_image_models/timm/models/__pycache__/ (in=0) (out=0) (stored 0%) adding: src/pytorch_image_models/timm/models/resnet.py (in=58768) (out=8319) (deflated 86%) adding: src/pytorch_image_models/timm/models/dpn.py (in=12328) (out=2859) (deflated 77%) adding: src/pytorch_image_models/timm/models/__init__.py (in=1064) (out=310) (deflated 71%) adding: src/pytorch_image_models/timm/models/dla.py (in=17155) (out=3586) (deflated 79%) adding: src/pytorch_image_models/timm/models/factory.py (in=2768) (out=1100) (deflated 60%) adding: src/pytorch_image_models/timm/models/hub.py (in=3409) (out=1326) (deflated 61%) adding: src/pytorch_image_models/timm/models/densenet.py (in=15595) (out=3709) (deflated 76%) adding: src/pytorch_image_models/timm/models/efficientnet_blocks.py (in=14680) (out=3084) (deflated 79%) adding: src/pytorch_image_models/timm/models/res2net.py (in=7849) (out=1919) (deflated 76%) adding: src/pytorch_image_models/timm/models/regnet.py (in=20529) (out=4873) (deflated 76%) adding: src/pytorch_image_models/timm/models/gluon_resnet.py (in=11348) (out=1489) (deflated 87%) adding: src/pytorch_image_models/timm/models/helpers.py (in=20989) (out=5413) (deflated 74%) adding: src/pytorch_image_models/timm/models/pnasnet.py (in=14839) (out=2860) (deflated 81%) adding: src/pytorch_image_models/timm/models/nasnet.py (in=25683) (out=3028) (deflated 88%) adding: src/pytorch_image_models/timm/models/vovnet.py (in=13821) (out=3150) (deflated 77%) adding: src/pytorch_image_models/timm/models/xception_aligned.py (in=9266) (out=2287) (deflated 75%) adding: src/pytorch_image_models/timm/models/pruned/ (in=0) (out=0) (stored 0%) adding: src/pytorch_image_models/timm/models/pruned/ecaresnet101d_pruned.txt (in=8734) (out=1311) (deflated 85%) adding: src/pytorch_image_models/timm/models/pruned/efficientnet_b2_pruned.txt (in=18676) (out=2208) (deflated 88%) adding: 
src/pytorch_image_models/timm/models/pruned/efficientnet_b3_pruned.txt (in=21133) (out=2476) (deflated 88%) adding: src/pytorch_image_models/timm/models/pruned/ecaresnet50d_pruned.txt (in=4520) (out=756) (deflated 83%) adding: src/pytorch_image_models/timm/models/pruned/efficientnet_b1_pruned.txt (in=18596) (out=2208) (deflated 88%) adding: src/pytorch_image_models/timm/models/sknet.py (in=8709) (out=1966) (deflated 77%) adding: src/pytorch_image_models/timm/models/features.py (in=12155) (out=3574) (deflated 71%) adding: src/pytorch_image_models/timm/models/registry.py (in=3970) (out=1351) (deflated 66%) adding: src/pytorch_image_models/timm/models/xception.py (in=7372) (out=2142) (deflated 71%) adding: src/pytorch_image_models/timm/models/inception_v3.py (in=17431) (out=3180) (deflated 82%) adding: src/pytorch_image_models/timm/models/tresnet.py (in=11433) (out=2709) (deflated 76%) adding: src/pytorch_image_models/timm/models/inception_v4.py (in=10723) (out=1913) (deflated 82%) adding: src/pytorch_image_models/timm/version.py (in=22) (out=22) (stored 0%) adding: src/pytorch_image_models/timm/utils/ (in=0) (out=0) (stored 0%) adding: src/pytorch_image_models/timm/utils/summary.py (in=1074) (out=478) (deflated 55%) adding: src/pytorch_image_models/timm/utils/__pycache__/ (in=0) (out=0) (stored 0%) adding: src/pytorch_image_models/timm/utils/__init__.py (in=459) (out=242) (deflated 47%) adding: src/pytorch_image_models/timm/utils/misc.py (in=644) (out=375) (deflated 42%) adding: src/pytorch_image_models/timm/utils/jit.py (in=648) (out=361) (deflated 44%) adding: src/pytorch_image_models/timm/utils/model_ema.py (in=5670) (out=1712) (deflated 70%) adding: src/pytorch_image_models/timm/utils/distributed.py (in=896) (out=416) (deflated 54%) adding: src/pytorch_image_models/timm/utils/cuda.py (in=1616) (out=529) (deflated 67%) adding: src/pytorch_image_models/timm/utils/model.py (in=389) (out=212) (deflated 46%) adding: src/pytorch_image_models/timm/utils/metrics.py (in=867) (out=426) (deflated 51%) adding: src/pytorch_image_models/timm/utils/checkpoint_saver.py (in=6133) (out=1649) (deflated 73%) adding: src/pytorch_image_models/timm/utils/log.py (in=1015) (out=429) (deflated 58%) adding: stats.pdf (in=31668) (out=26372) (deflated 17%) adding: tools/ (in=0) (out=0) (stored 0%) adding: tools/azureml/ (in=0) (out=0) (stored 0%) adding: tools/azureml/workspace_utils.py (in=4563) (out=1197) (deflated 74%) adding: tools/azureml/aml_main.py (in=7466) (out=2398) (deflated 68%) adding: tools/azureml/aml_job.py (in=6141) (out=1984) (deflated 68%) adding: tools/azureml/__pycache__/ (in=0) (out=0) (stored 0%) adding: tools/azureml/misc.py (in=404) (out=232) (deflated 43%) adding: tools/azureml/README.md (in=4733) (out=1947) (deflated 59%) adding: tools/azureml/aml_submit.py (in=7630) (out=2642) (deflated 65%) adding: tools/azureml/aml_job_config.json (in=4437) (out=1844) (deflated 58%) adding: tools/common_utils/ (in=0) (out=0) (stored 0%) adding: tools/common_utils/RandAugment.py (in=13823) (out=3960) (deflated 71%) adding: tools/common_utils/azure_storage_io.py (in=6058) (out=1427) (deflated 76%) adding: tools/common_utils/utils.py (in=12776) (out=2937) (deflated 77%) adding: tools/common_utils/__pycache__/ (in=0) (out=0) (stored 0%) adding: tools/common_utils/misc.py (in=6657) (out=1844) (deflated 72%) adding: tools/common_utils/ignore_file.py (in=2631) (out=745) (deflated 72%) adding: tools/common_utils/RandomCrop.py (in=38062) (out=4655) (deflated 88%) adding: vinvl_label.json (in=70538) (out=24658) 
(deflated 65%) adding: visualize.py (in=3403) (out=1336) (deflated 61%)
total bytes=55353896, compressed=15757117 -> 72% savings
2022-03-17 13:32:23,076.076 2829:qd_pytorch.py:1420 load_latest_parameters(): using output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/parameters_2022_03_16_04_43_35.yaml
2022-03-17 13:32:23,368.368 2829:uni_pipeline.py:841 _ensure_initialized(): initialized
2022-03-17 13:32:23,720.720 2829:modeling_utils.py:187 from_pretrained(): loading configuration file ./aux_data/untrained_config/VILT-L12-H784-uncased_16_384/config.json
2022-03-17 13:32:23,720.720 2829:modeling_utils.py:211 from_pretrained(): Model config {
  "attention_probs_dropout_prob": 0.1,
  "finetuning_task": "image_captioning",
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "TIMM_vit",
  "net": "vit_base_patch16_384",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "num_labels": 2,
  "output_attentions": false,
  "output_hidden_states": false,
  "pretrained": true,
  "torchscript": false,
  "type_vocab_size": 2,
  "vocab_size": 30522
}
2022-03-17 13:32:23,722.722 2829:tokenization_utils.py:170 _from_pretrained(): Model name './aux_data/untrained_config/VILT-L12-H784-uncased_16_384' not found in model shortcut name list (bert-base-uncased, bert-large-uncased, bert-base-cased, bert-large-cased, bert-base-multilingual-uncased, bert-base-multilingual-cased, bert-base-chinese, bert-base-german-cased, bert-large-uncased-whole-word-masking, bert-large-cased-whole-word-masking, bert-large-uncased-whole-word-masking-finetuned-squad, bert-large-cased-whole-word-masking-finetuned-squad, bert-base-cased-finetuned-mrpc). Assuming './aux_data/untrained_config/VILT-L12-H784-uncased_16_384' is a path or url to a directory containing tokenizer files.
2022-03-17 13:32:23,723.723 2829:tokenization_utils.py:180 _from_pretrained(): Didn't find file ./aux_data/untrained_config/VILT-L12-H784-uncased_16_384/added_tokens.json. We won't load it.
2022-03-17 13:32:23,723.723 2829:tokenization_utils.py:180 _from_pretrained(): Didn't find file ./aux_data/untrained_config/VILT-L12-H784-uncased_16_384/special_tokens_map.json. We won't load it.
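[Editor's note] The tokenizer messages above show the usual transformers-style resolution order: the name is first checked against the built-in shortcut list, and anything else is treated as a local directory whose optional files may simply be absent. A minimal sketch of that fallback, assuming standard BERT tokenizer file names; resolve_tokenizer_files is a hypothetical helper, not part of this codebase:

import os

OPTIONAL_FILES = ("added_tokens.json", "special_tokens_map.json")

def resolve_tokenizer_files(name_or_path, shortcut_names=()):
    """Mimic the fallback the log describes: unknown names are local dirs."""
    if name_or_path in shortcut_names:
        # A shortcut name (e.g. "bert-base-uncased") would map to hosted URLs.
        return {"vocab.txt": name_or_path}
    files = {"vocab.txt": os.path.join(name_or_path, "vocab.txt")}  # required
    for fname in OPTIONAL_FILES:
        path = os.path.join(name_or_path, fname)
        # Missing optional files are reported and skipped ("We won't load it."),
        # which is why the loader logs "loading file None" for them below.
        files[fname] = path if os.path.exists(path) else None
    return files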
2022-03-17 13:32:23,723.723 2829:tokenization_utils.py:214 _from_pretrained(): loading file None
2022-03-17 13:32:23,723.723 2829:tokenization_utils.py:214 _from_pretrained(): loading file None
2022-03-17 13:32:23,723.723 2829:tokenization_utils.py:214 _from_pretrained(): loading file ./aux_data/untrained_config/VILT-L12-H784-uncased_16_384/vocab.txt
--- Logging error ---
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 1025, in emit
    msg = self.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 869, in format
    return fmt.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 608, in format
    record.message = record.getMessage()
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 367, in getMessage
    msg = str(self.msg)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 236, in __repr__
    return str(self.to_json_string())
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 245, in to_json_string
    return json.dumps(self.to_dict(), indent=2, sort_keys=True) + "\n"
  File "/opt/conda/lib/python3.7/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 201, in encode
    chunks = list(chunks)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 431, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "/opt/conda/lib/python3.7/json/encoder.py", line 438, in _iterencode
    o = _default(o)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type BertTokenizer is not JSON serializable
Call stack:
  File "src/qd/pipeline.py", line 1368, in <module>
    locals()[function_name](**kwargs)
  File "src/qd/pipeline.py", line 628, in pipeline_train_eval_multi
    pip.ensure_predict()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 886, in ensure_predict
    self.predict(model_file, predict_result_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 989, in predict
    model = self.get_model(is_train=False)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model
    model = self.get_raw_model(is_train)
  File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model
    model = TaggerEncDecSplitForImageCaptioning(config=config)  # init from scratch
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__
    self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__
    self.encoder = TIMMVitSplitEncoder(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__
    logging.info(config)
Unable to print the message and arguments - possible formatting error. Use the traceback above to help find the error.
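[Editor's note] The TypeError above is mechanical rather than fatal: the config object stores a BertTokenizer instance among its attributes, so the json.dumps() call inside to_json_string() raises as soon as logging.info(config) triggers __repr__; only the log message is lost, and model construction continues. A minimal sketch of a serialization-safe variant, keeping the method names from the traceback; the default=str fallback is an assumed fix, not the repository's code:

import json

class SafeConfigReprMixin:
    """Config-like object whose repr never raises on non-JSON attributes."""

    def to_dict(self):
        return dict(self.__dict__)

    def to_json_string(self):
        # default=str renders any non-JSON value (such as a BertTokenizer
        # instance stored on the config) as its string form instead of
        # raising "Object of type BertTokenizer is not JSON serializable".
        return json.dumps(self.to_dict(), indent=2, sort_keys=True,
                          default=str) + "\n"

    def __repr__(self):
        return self.to_json_string()

The same error is logged a second time below, consistent with the image encoder being constructed twice (two load_pretrained calls follow).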
--- Logging error ---
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 1025, in emit
    msg = self.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 869, in format
    return fmt.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 608, in format
    record.message = record.getMessage()
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 367, in getMessage
    msg = str(self.msg)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 236, in __repr__
    return str(self.to_json_string())
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 245, in to_json_string
    return json.dumps(self.to_dict(), indent=2, sort_keys=True) + "\n"
  File "/opt/conda/lib/python3.7/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 201, in encode
    chunks = list(chunks)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 431, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "/opt/conda/lib/python3.7/json/encoder.py", line 438, in _iterencode
    o = _default(o)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type BertTokenizer is not JSON serializable
Call stack:
  File "src/qd/pipeline.py", line 1368, in <module>
    locals()[function_name](**kwargs)
  File "src/qd/pipeline.py", line 628, in pipeline_train_eval_multi
    pip.ensure_predict()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 886, in ensure_predict
    self.predict(model_file, predict_result_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 989, in predict
    model = self.get_model(is_train=False)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model
    model = self.get_raw_model(is_train)
  File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model
    model = TaggerEncDecSplitForImageCaptioning(config=config)  # init from scratch
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__
    self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__
    self.encoder = TIMMVitSplitEncoder(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__
    logging.info(config)
Unable to print the message and arguments - possible formatting error. Use the traceback above to help find the error.
2022-03-17 13:32:24,439.439 2829:modeling_bert.py:529 __init__(): TIMM Split image encoder load from pre-trained: True
2022-03-17 13:32:25,514.514 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-17 13:32:27,035.035 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-17 13:32:27,691.691 2829:modeling_bert.py:2677 __init__(): BertImgModel Image Dimension: 2054
2022-03-17 13:32:27,840.840 2829:modeling_bert.py:2688 __init__(): BertImgModel Image Dimension: 2054
2022-03-17 13:32:28,631.631 2829:tagger_caption_uni_pipeline_expanding.py:1130 get_image_encoder_model(): VIT image encoder loaded from pre-trained weight!
Note that this might be replaced by pre-trained checkpoint later! 2022-03-17 13:32:28,632.632 2829:tagger_caption_uni_pipeline_expanding.py:1134 get_image_encoder_model(): Non-Patch Selection Mode. 2022-03-17 13:32:29,736.736 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py 2022-03-17 13:32:30,268.268 2829:checkpoint.py:240 load(): Loading checkpoint from output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0066415.pt 2022-03-17 13:32:38,771.771 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.cls_token loaded from image_encoder.module.cls_token of shape (1, 1, 768) 2022-03-17 13:32:38,772.772 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.head.bias loaded from image_encoder.module.head.bias of shape (1000,) 2022-03-17 13:32:38,772.772 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.head.weight loaded from image_encoder.module.head.weight of shape (1000, 768) 2022-03-17 13:32:38,772.772 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.patch_embed.proj.bias loaded from image_encoder.module.patch_embed.proj.bias of shape (768,) 2022-03-17 13:32:38,772.772 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.patch_embed.proj.weight loaded from image_encoder.module.patch_embed.proj.weight of shape (768, 3, 16, 16) 2022-03-17 13:32:38,772.772 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.pos_embed loaded from image_encoder.module.pos_embed of shape (1, 577, 768) 2022-03-17 13:32:38,772.772 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.caption_pooler.dense.bias loaded from bert.caption_pooler.dense.bias of shape (768,) 2022-03-17 13:32:38,772.772 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.caption_pooler.dense.weight loaded from bert.caption_pooler.dense.weight of shape (768, 768) 2022-03-17 13:32:38,772.772 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.LayerNorm.bias loaded from bert.decoder.layer.0.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:32:38,772.772 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.LayerNorm.weight loaded from bert.decoder.layer.0.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:32:38,773.773 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.dense.bias loaded from bert.decoder.layer.0.attention.output.dense.bias of shape (768,) 2022-03-17 13:32:38,773.773 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.dense.weight loaded from bert.decoder.layer.0.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:32:38,773.773 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.key.bias loaded from bert.decoder.layer.0.attention.self.key.bias of shape (768,) 2022-03-17 13:32:38,773.773 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.key.weight loaded from bert.decoder.layer.0.attention.self.key.weight of shape (768, 768) 2022-03-17 13:32:38,773.773 
2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.query.bias loaded from bert.decoder.layer.0.attention.self.query.bias of shape (768,) 2022-03-17 13:32:38,773.773 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.query.weight loaded from bert.decoder.layer.0.attention.self.query.weight of shape (768, 768) 2022-03-17 13:32:38,773.773 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.value.bias loaded from bert.decoder.layer.0.attention.self.value.bias of shape (768,) 2022-03-17 13:32:38,773.773 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.value.weight loaded from bert.decoder.layer.0.attention.self.value.weight of shape (768, 768) 2022-03-17 13:32:38,773.773 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.intermediate.dense.bias loaded from bert.decoder.layer.0.intermediate.dense.bias of shape (3072,) 2022-03-17 13:32:38,773.773 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.intermediate.dense.weight loaded from bert.decoder.layer.0.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:32:38,773.773 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.LayerNorm.bias loaded from bert.decoder.layer.0.output.LayerNorm.bias of shape (768,) 2022-03-17 13:32:38,773.773 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.LayerNorm.weight loaded from bert.decoder.layer.0.output.LayerNorm.weight of shape (768,) 2022-03-17 13:32:38,773.773 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.dense.bias loaded from bert.decoder.layer.0.output.dense.bias of shape (768,) 2022-03-17 13:32:38,774.774 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.dense.weight loaded from bert.decoder.layer.0.output.dense.weight of shape (768, 3072) 2022-03-17 13:32:38,774.774 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.LayerNorm.bias loaded from bert.decoder.layer.1.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:32:38,774.774 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.LayerNorm.weight loaded from bert.decoder.layer.1.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:32:38,774.774 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.dense.bias loaded from bert.decoder.layer.1.attention.output.dense.bias of shape (768,) 2022-03-17 13:32:38,774.774 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.dense.weight loaded from bert.decoder.layer.1.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:32:38,774.774 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.key.bias loaded from bert.decoder.layer.1.attention.self.key.bias of shape (768,) 2022-03-17 13:32:38,774.774 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.key.weight loaded from bert.decoder.layer.1.attention.self.key.weight of shape (768, 768) 2022-03-17 13:32:38,774.774 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.query.bias loaded from 
bert.decoder.layer.1.attention.self.query.bias of shape (768,) 2022-03-17 13:32:38,774.774 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.query.weight loaded from bert.decoder.layer.1.attention.self.query.weight of shape (768, 768) 2022-03-17 13:32:38,774.774 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.value.bias loaded from bert.decoder.layer.1.attention.self.value.bias of shape (768,) 2022-03-17 13:32:38,774.774 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.value.weight loaded from bert.decoder.layer.1.attention.self.value.weight of shape (768, 768) 2022-03-17 13:32:38,774.774 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.intermediate.dense.bias loaded from bert.decoder.layer.1.intermediate.dense.bias of shape (3072,) 2022-03-17 13:32:38,774.774 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.intermediate.dense.weight loaded from bert.decoder.layer.1.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:32:38,774.774 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.LayerNorm.bias loaded from bert.decoder.layer.1.output.LayerNorm.bias of shape (768,) 2022-03-17 13:32:38,775.775 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.LayerNorm.weight loaded from bert.decoder.layer.1.output.LayerNorm.weight of shape (768,) 2022-03-17 13:32:38,775.775 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.dense.bias loaded from bert.decoder.layer.1.output.dense.bias of shape (768,) 2022-03-17 13:32:38,775.775 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.dense.weight loaded from bert.decoder.layer.1.output.dense.weight of shape (768, 3072) 2022-03-17 13:32:38,775.775 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.LayerNorm.bias loaded from bert.decoder.layer.2.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:32:38,775.775 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.LayerNorm.weight loaded from bert.decoder.layer.2.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:32:38,775.775 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.dense.bias loaded from bert.decoder.layer.2.attention.output.dense.bias of shape (768,) 2022-03-17 13:32:38,775.775 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.dense.weight loaded from bert.decoder.layer.2.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:32:38,775.775 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.key.bias loaded from bert.decoder.layer.2.attention.self.key.bias of shape (768,) 2022-03-17 13:32:38,775.775 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.key.weight loaded from bert.decoder.layer.2.attention.self.key.weight of shape (768, 768) 2022-03-17 13:32:38,775.775 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.query.bias loaded from bert.decoder.layer.2.attention.self.query.bias of shape (768,) 2022-03-17 13:32:38,775.775 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.decoder.layer.2.attention.self.query.weight loaded from bert.decoder.layer.2.attention.self.query.weight of shape (768, 768) 2022-03-17 13:32:38,775.775 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.value.bias loaded from bert.decoder.layer.2.attention.self.value.bias of shape (768,) 2022-03-17 13:32:38,775.775 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.value.weight loaded from bert.decoder.layer.2.attention.self.value.weight of shape (768, 768) 2022-03-17 13:32:38,776.776 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.intermediate.dense.bias loaded from bert.decoder.layer.2.intermediate.dense.bias of shape (3072,) 2022-03-17 13:32:38,776.776 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.intermediate.dense.weight loaded from bert.decoder.layer.2.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:32:38,776.776 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.LayerNorm.bias loaded from bert.decoder.layer.2.output.LayerNorm.bias of shape (768,) 2022-03-17 13:32:38,776.776 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.LayerNorm.weight loaded from bert.decoder.layer.2.output.LayerNorm.weight of shape (768,) 2022-03-17 13:32:38,776.776 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.dense.bias loaded from bert.decoder.layer.2.output.dense.bias of shape (768,) 2022-03-17 13:32:38,776.776 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.dense.weight loaded from bert.decoder.layer.2.output.dense.weight of shape (768, 3072) 2022-03-17 13:32:38,776.776 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.LayerNorm.bias loaded from bert.decoder.layer.3.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:32:38,776.776 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.LayerNorm.weight loaded from bert.decoder.layer.3.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:32:38,776.776 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.dense.bias loaded from bert.decoder.layer.3.attention.output.dense.bias of shape (768,) 2022-03-17 13:32:38,776.776 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.dense.weight loaded from bert.decoder.layer.3.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:32:38,776.776 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.key.bias loaded from bert.decoder.layer.3.attention.self.key.bias of shape (768,) 2022-03-17 13:32:38,776.776 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.key.weight loaded from bert.decoder.layer.3.attention.self.key.weight of shape (768, 768) 2022-03-17 13:32:38,776.776 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.query.bias loaded from bert.decoder.layer.3.attention.self.query.bias of shape (768,) 2022-03-17 13:32:38,776.776 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.query.weight loaded from bert.decoder.layer.3.attention.self.query.weight of shape (768, 768) 2022-03-17 
13:32:38,777.777 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.value.bias loaded from bert.decoder.layer.3.attention.self.value.bias of shape (768,) 2022-03-17 13:32:38,777.777 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.value.weight loaded from bert.decoder.layer.3.attention.self.value.weight of shape (768, 768) 2022-03-17 13:32:38,777.777 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.intermediate.dense.bias loaded from bert.decoder.layer.3.intermediate.dense.bias of shape (3072,) 2022-03-17 13:32:38,777.777 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.intermediate.dense.weight loaded from bert.decoder.layer.3.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:32:38,777.777 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.LayerNorm.bias loaded from bert.decoder.layer.3.output.LayerNorm.bias of shape (768,) 2022-03-17 13:32:38,777.777 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.LayerNorm.weight loaded from bert.decoder.layer.3.output.LayerNorm.weight of shape (768,) 2022-03-17 13:32:38,777.777 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.dense.bias loaded from bert.decoder.layer.3.output.dense.bias of shape (768,) 2022-03-17 13:32:38,777.777 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.dense.weight loaded from bert.decoder.layer.3.output.dense.weight of shape (768, 3072) 2022-03-17 13:32:38,777.777 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.LayerNorm.bias loaded from bert.embeddings.LayerNorm.bias of shape (768,) 2022-03-17 13:32:38,777.777 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.LayerNorm.weight loaded from bert.embeddings.LayerNorm.weight of shape (768,) 2022-03-17 13:32:38,778.778 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.position_embeddings.weight loaded from bert.embeddings.position_embeddings.weight of shape (512, 768) 2022-03-17 13:32:38,778.778 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.token_type_embeddings.weight loaded from bert.embeddings.token_type_embeddings.weight of shape (2, 768) 2022-03-17 13:32:38,778.778 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.word_embeddings.weight loaded from bert.embeddings.word_embeddings.weight of shape (30522, 768) 2022-03-17 13:32:38,778.778 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.proj.bias loaded from bert.encoder.blocks.0.attn.proj.bias of shape (768,) 2022-03-17 13:32:38,778.778 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.proj.weight loaded from bert.encoder.blocks.0.attn.proj.weight of shape (768, 768) 2022-03-17 13:32:38,778.778 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.qkv.bias loaded from bert.encoder.blocks.0.attn.qkv.bias of shape (2304,) 2022-03-17 13:32:38,778.778 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.qkv.weight loaded from bert.encoder.blocks.0.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:32:38,778.778 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc1.bias loaded from 
bert.encoder.blocks.0.mlp.fc1.bias of shape (3072,) 2022-03-17 13:32:38,778.778 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc1.weight loaded from bert.encoder.blocks.0.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:32:38,779.779 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc2.bias loaded from bert.encoder.blocks.0.mlp.fc2.bias of shape (768,) 2022-03-17 13:32:38,779.779 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc2.weight loaded from bert.encoder.blocks.0.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:32:38,779.779 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm1.bias loaded from bert.encoder.blocks.0.norm1.bias of shape (768,) 2022-03-17 13:32:38,779.779 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm1.weight loaded from bert.encoder.blocks.0.norm1.weight of shape (768,) 2022-03-17 13:32:38,779.779 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm2.bias loaded from bert.encoder.blocks.0.norm2.bias of shape (768,) 2022-03-17 13:32:38,779.779 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm2.weight loaded from bert.encoder.blocks.0.norm2.weight of shape (768,) 2022-03-17 13:32:38,779.779 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.proj.bias loaded from bert.encoder.blocks.1.attn.proj.bias of shape (768,) 2022-03-17 13:32:38,779.779 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.proj.weight loaded from bert.encoder.blocks.1.attn.proj.weight of shape (768, 768) 2022-03-17 13:32:38,779.779 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.qkv.bias loaded from bert.encoder.blocks.1.attn.qkv.bias of shape (2304,) 2022-03-17 13:32:38,779.779 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.qkv.weight loaded from bert.encoder.blocks.1.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:32:38,779.779 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc1.bias loaded from bert.encoder.blocks.1.mlp.fc1.bias of shape (3072,) 2022-03-17 13:32:38,779.779 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc1.weight loaded from bert.encoder.blocks.1.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:32:38,780.780 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc2.bias loaded from bert.encoder.blocks.1.mlp.fc2.bias of shape (768,) 2022-03-17 13:32:38,780.780 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc2.weight loaded from bert.encoder.blocks.1.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:32:38,780.780 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm1.bias loaded from bert.encoder.blocks.1.norm1.bias of shape (768,) 2022-03-17 13:32:38,780.780 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm1.weight loaded from bert.encoder.blocks.1.norm1.weight of shape (768,) 2022-03-17 13:32:38,780.780 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm2.bias loaded from bert.encoder.blocks.1.norm2.bias of shape (768,) 2022-03-17 13:32:38,780.780 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.encoder.blocks.1.norm2.weight loaded from bert.encoder.blocks.1.norm2.weight of shape (768,) 2022-03-17 13:32:38,780.780 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.proj.bias loaded from bert.encoder.blocks.10.attn.proj.bias of shape (768,) 2022-03-17 13:32:38,780.780 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.proj.weight loaded from bert.encoder.blocks.10.attn.proj.weight of shape (768, 768) 2022-03-17 13:32:38,780.780 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.qkv.bias loaded from bert.encoder.blocks.10.attn.qkv.bias of shape (2304,) 2022-03-17 13:32:38,780.780 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.qkv.weight loaded from bert.encoder.blocks.10.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:32:38,780.780 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc1.bias loaded from bert.encoder.blocks.10.mlp.fc1.bias of shape (3072,) 2022-03-17 13:32:38,780.780 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc1.weight loaded from bert.encoder.blocks.10.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:32:38,781.781 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc2.bias loaded from bert.encoder.blocks.10.mlp.fc2.bias of shape (768,) 2022-03-17 13:32:38,781.781 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc2.weight loaded from bert.encoder.blocks.10.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:32:38,781.781 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm1.bias loaded from bert.encoder.blocks.10.norm1.bias of shape (768,) 2022-03-17 13:32:38,781.781 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm1.weight loaded from bert.encoder.blocks.10.norm1.weight of shape (768,) 2022-03-17 13:32:38,781.781 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm2.bias loaded from bert.encoder.blocks.10.norm2.bias of shape (768,) 2022-03-17 13:32:38,781.781 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm2.weight loaded from bert.encoder.blocks.10.norm2.weight of shape (768,) 2022-03-17 13:32:38,781.781 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.proj.bias loaded from bert.encoder.blocks.11.attn.proj.bias of shape (768,) 2022-03-17 13:32:38,781.781 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.proj.weight loaded from bert.encoder.blocks.11.attn.proj.weight of shape (768, 768) 2022-03-17 13:32:38,781.781 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.qkv.bias loaded from bert.encoder.blocks.11.attn.qkv.bias of shape (2304,) 2022-03-17 13:32:38,781.781 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.qkv.weight loaded from bert.encoder.blocks.11.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:32:38,781.781 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc1.bias loaded from bert.encoder.blocks.11.mlp.fc1.bias of shape (3072,) 2022-03-17 13:32:38,781.781 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc1.weight loaded from 
bert.encoder.blocks.11.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:32:38,782.782 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc2.bias loaded from bert.encoder.blocks.11.mlp.fc2.bias of shape (768,) 2022-03-17 13:32:38,782.782 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc2.weight loaded from bert.encoder.blocks.11.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:32:38,782.782 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm1.bias loaded from bert.encoder.blocks.11.norm1.bias of shape (768,) 2022-03-17 13:32:38,782.782 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm1.weight loaded from bert.encoder.blocks.11.norm1.weight of shape (768,) 2022-03-17 13:32:38,782.782 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm2.bias loaded from bert.encoder.blocks.11.norm2.bias of shape (768,) 2022-03-17 13:32:38,782.782 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm2.weight loaded from bert.encoder.blocks.11.norm2.weight of shape (768,) 2022-03-17 13:32:38,782.782 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.proj.bias loaded from bert.encoder.blocks.2.attn.proj.bias of shape (768,) 2022-03-17 13:32:38,782.782 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.proj.weight loaded from bert.encoder.blocks.2.attn.proj.weight of shape (768, 768) 2022-03-17 13:32:38,782.782 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.qkv.bias loaded from bert.encoder.blocks.2.attn.qkv.bias of shape (2304,) 2022-03-17 13:32:38,782.782 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.qkv.weight loaded from bert.encoder.blocks.2.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:32:38,782.782 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc1.bias loaded from bert.encoder.blocks.2.mlp.fc1.bias of shape (3072,) 2022-03-17 13:32:38,782.782 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc1.weight loaded from bert.encoder.blocks.2.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:32:38,782.782 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc2.bias loaded from bert.encoder.blocks.2.mlp.fc2.bias of shape (768,) 2022-03-17 13:32:38,783.783 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc2.weight loaded from bert.encoder.blocks.2.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:32:38,783.783 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm1.bias loaded from bert.encoder.blocks.2.norm1.bias of shape (768,) 2022-03-17 13:32:38,783.783 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm1.weight loaded from bert.encoder.blocks.2.norm1.weight of shape (768,) 2022-03-17 13:32:38,783.783 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm2.bias loaded from bert.encoder.blocks.2.norm2.bias of shape (768,) 2022-03-17 13:32:38,783.783 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm2.weight loaded from bert.encoder.blocks.2.norm2.weight of shape (768,) 2022-03-17 13:32:38,783.783 2829:checkpoint.py:99 
align_and_update_state_dicts(): [per-parameter dump condensed; timestamps run 13:32:38,783 through 13:32:38,794, all from 2829:checkpoint.py:99]
  module.bert.encoder.blocks.{3-9}.* loaded from bert.encoder.blocks.{3-9}.*: attn.proj.weight (768, 768)/bias (768,), attn.qkv.weight (2304, 768)/bias (2304,), mlp.fc1.weight (3072, 768)/bias (3072,), mlp.fc2.weight (768, 3072)/bias (768,), norm1 and norm2 weight/bias (768,)
  module.bert.encoder.tag_blocks.{0-3}.* loaded from bert.encoder.tag_blocks.{0-3}.*: same six sub-modules and shapes as the blocks above
  module.bert.extra_embeddings.* loaded from bert.extra_embeddings.*: LayerNorm weight/bias (768,), position_embeddings.weight (512, 768), token_type_embeddings.weight (2, 768), word_embeddings.weight (30522, 768)
  module.bert.pooler.dense.* loaded from bert.pooler.dense.*: weight (768, 768), bias (768,)
  module.bert.tag_logit.predictions.* loaded from bert.tag_logit.predictions.*: bias (30522,), decoder.weight (30522, 768), transform.LayerNorm weight/bias (768,), transform.dense.weight (768, 768)/bias (768,)
  module.cls.predictions.* loaded from cls.predictions.*: bias (30522,), decoder.weight (30522, 768), transform.LayerNorm weight/bias (768,), transform.dense.weight (768, 768)/bias (768,)
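Every entry in the dump above follows one pattern: a target key carrying the `module.` wrapper that (Distributed)DataParallel adds is matched to the checkpoint key without it, with shapes compared along the way. Below is a minimal sketch of that matching step; `align_state_dict` is a hypothetical stand-in for the repo's `align_and_update_state_dicts`, which additionally tallies the matched/loaded counts reported in the lines that follow.

```python
import logging
from collections import OrderedDict

def align_state_dict(model_state, ckpt_state):
    """Match each target parameter to a checkpoint tensor, tolerating the
    'module.' prefix added by (Distributed)DataParallel.  Hypothetical
    sketch; only shape-compatible matches are kept, as in the log."""
    aligned = OrderedDict()
    for tgt_key, tgt_tensor in model_state.items():
        # try the exact name first, then the name with the wrapper stripped
        src_key = tgt_key
        if src_key not in ckpt_state and tgt_key.startswith('module.'):
            src_key = tgt_key[len('module.'):]
        if src_key in ckpt_state and ckpt_state[src_key].shape == tgt_tensor.shape:
            aligned[tgt_key] = ckpt_state[src_key]
            logging.info('%s loaded from %s of shape %s',
                         tgt_key, src_key, tuple(ckpt_state[src_key].shape))
    return aligned
```

Loading the aligned dict with `model.load_state_dict(aligned, strict=False)` leaves any unmatched parameters untouched, which is what the `load_model_state_ignore_mismatch()` lines below report; here only `module.cls.predictions.decoder.weight` is left uninitialized.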
2022-03-17 13:32:38,794.794 2829:checkpoint.py:104 align_and_update_state_dicts(): target model param = 288; name matched = 288; loaded = 288
2022-03-17 13:32:38,794.794 2829:checkpoint.py:107 align_and_update_state_dicts(): from loaded; ignore = []
2022-03-17 13:32:38,797.797 2829:torch_common.py:1069 load_model_state_ignore_mismatch(): unique keys in init dict = ['module.cls.predictions.decoder.weight']; total = 1
2022-03-17 13:32:38,959.959 2829:torch_common.py:1074 load_model_state_ignore_mismatch(): unique key (not initialized) in current model = ['module.cls.predictions.decoder.weight']
2022-03-17 13:32:39,268.268 2829:qd_common.py:3452 print_frame_info(): func name = __init__; self = ; data = TaxCocoCaption; split = test; add_key = False; backend = cv; hold_buffer = 0; save_original = False
2022-03-17 13:32:39,294.294 2829:uni_pipeline.py:509 get_data_loader(): sampler =
2022-03-17 13:32:39,294.294 2829:uni_pipeline.py:1000 predict(): writing output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0066415.pt.TaxCocoCaption.test.crop384.predict.tsv_0_32.tsv
2022-03-17 13:32:39,367.367 2829:uni_pipeline.py:907 predict_iter(): DatasetPlusTransform(dataset=, transform=Compose(
    Compose(
        ImageTransform2Dict(image_transform=Compose(
            ToPILImage()
            Resize(size=384, interpolation=PIL.Image.BICUBIC)
            CenterCrop(size=(384, 384))
            ToTensor()
            Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
        ))
    )
    LoadLabel(data=TaxCocoCaption, split=test, version=vinvl)
    TransCaptionTensorizer(tensorizer=, pad_to_max=True, pad_image_to_max=True)
))
uni_pipeline.py:908: 0%| | 0/10 [00:00<?, ?it/s]
--- Logging error --- [header and traceback of this first dump were lost in log capture; the identical full dump follows below. Its call stack survives:]
Call stack:
  File "src/qd/pipeline.py", line 1368, in <module>
    locals()[function_name](**kwargs)
  File "src/qd/pipeline.py", line 637, in pipeline_train_eval_multi
    pip.monitor_train()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1224, in monitor_train
    need_wait_models = self.pred_eval_intermediate_models()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1286, in pred_eval_intermediate_models
    pred = self.ensure_predict(model_file=model_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 886, in ensure_predict
    self.predict(model_file, predict_result_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 989, in predict
    model = self.get_model(is_train=False)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model
    model = self.get_raw_model(is_train)
  File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model
    model = TaggerEncDecSplitForImageCaptioning(config=config)  # init from scratch
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__
    self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__
    self.encoder = TIMMVitSplitEncoder(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__
    logging.info(config)
Unable to print the message and arguments - possible formatting error. Use the traceback above to help find the error.
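The `DatasetPlusTransform` dump from `predict_iter()` above spells out the test-time preprocessing. As a sketch, the image half corresponds to the torchvision pipeline below; the repo-specific `LoadLabel` and `TransCaptionTensorizer` steps are omitted, and `InterpolationMode.BICUBIC` is the newer spelling of the `PIL.Image.BICUBIC` shown in the log.

```python
from torchvision import transforms

# Test-time image preprocessing as logged by predict_iter(): with
# backend = cv the dataset yields HWC uint8 arrays, hence ToPILImage first.
image_transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.Resize(384, interpolation=transforms.InterpolationMode.BICUBIC),
    transforms.CenterCrop((384, 384)),
    transforms.ToTensor(),                      # HWC uint8 -> CHW float in [0, 1]
    transforms.Normalize(mean=[0.5, 0.5, 0.5],  # then shift/scale to [-1, 1]
                         std=[0.5, 0.5, 0.5]),
])
```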
--- Logging error ---
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 1025, in emit
    msg = self.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 869, in format
    return fmt.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 608, in format
    record.message = record.getMessage()
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 367, in getMessage
    msg = str(self.msg)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 236, in __repr__
    return str(self.to_json_string())
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 245, in to_json_string
    return json.dumps(self.to_dict(), indent=2, sort_keys=True) + "\n"
  File "/opt/conda/lib/python3.7/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 201, in encode
    chunks = list(chunks)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 431, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "/opt/conda/lib/python3.7/json/encoder.py", line 438, in _iterencode
    o = _default(o)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type BertTokenizer is not JSON serializable
Call stack:
  [identical to the call stack of the first dump above, from pipeline.py line 1368 down to]
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__
    logging.info(config)
Unable to print the message and arguments - possible formatting error. Use the traceback above to help find the error.
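Both logging-error dumps stem from the same call (the constructor that runs `logging.info(config)` is evidently executed twice): the config's `__repr__` serializes `self.to_dict()` with `json.dumps`, the dict holds a live `BertTokenizer`, and the default JSON encoder rejects it. The error is cosmetic, in that only the config printout is lost, and the run continues. A minimal reproduction, plus one possible workaround; the `default=repr` fallback is an assumption, not the repo's fix.

```python
import json

class FakeTokenizer:  # stand-in for the BertTokenizer stored in the config
    pass

config_dict = {"hidden_size": 768, "tokenizer": FakeTokenizer()}

try:
    # what to_json_string() does, per the traceback above
    json.dumps(config_dict, indent=2, sort_keys=True)
except TypeError as err:
    print(err)  # Object of type FakeTokenizer is not JSON serializable

# Workaround sketch: fall back to repr() for anything json cannot encode,
# so __repr__ (and hence logging.info(config)) never raises.
print(json.dumps(config_dict, indent=2, sort_keys=True, default=repr))
```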
2022-03-17 13:35:15,791.791 2829:modeling_bert.py:529 __init__(): TIMM Split image encoder load from pre-trained: True
2022-03-17 13:35:16,854.854 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth)
2022-03-17 13:35:18,398.398 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth)
2022-03-17 13:35:19,047.047 2829:modeling_bert.py:2677 __init__(): BertImgModel Image Dimension: 2054
2022-03-17 13:35:19,197.197 2829:modeling_bert.py:2688 __init__(): BertImgModel Image Dimension: 2054
2022-03-17 13:35:19,989.989 2829:tagger_caption_uni_pipeline_expanding.py:1130 get_image_encoder_model(): VIT image encoder loaded from pre-trained weight! Note that this might be replaced by pre-trained checkpoint later!
2022-03-17 13:35:19,989.989 2829:tagger_caption_uni_pipeline_expanding.py:1134 get_image_encoder_model(): Non-Patch Selection Mode.
2022-03-17 13:35:21,118.118 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth)
2022-03-17 13:35:21,665.665 2829:checkpoint.py:240 load(): Loading checkpoint from output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0005000.pt
2022-03-17 13:35:23,674 through 13:35:23,688 2829:checkpoint.py:99 align_and_update_state_dicts(): [per-parameter dump condensed, as above]
  image_encoder.module.* loaded from image_encoder.module.* (names already identical): cls_token (1, 1, 768), pos_embed (1, 577, 768), patch_embed.proj.weight (768, 3, 16, 16)/bias (768,), head.weight (1000, 768)/bias (1000,)
  module.bert.caption_pooler.dense.* loaded from bert.caption_pooler.dense.*: weight (768, 768), bias (768,)
  module.bert.decoder.layer.{0-3}.* loaded from bert.decoder.layer.{0-3}.*: attention.self.{query,key,value}.weight (768, 768)/bias (768,), attention.output.dense.weight (768, 768)/bias (768,), attention.output.LayerNorm weight/bias (768,), intermediate.dense.weight (3072, 768)/bias (3072,), output.dense.weight (768, 3072)/bias (768,), output.LayerNorm weight/bias (768,)
  module.bert.embeddings.* loaded from bert.embeddings.*: LayerNorm weight/bias (768,), position_embeddings.weight (512, 768), token_type_embeddings.weight (2, 768), word_embeddings.weight (30522, 768)
  module.bert.encoder.blocks.{0-6, 10, 11}.* loaded from bert.encoder.blocks.*: same six sub-modules and shapes as in the first checkpoint load
2022-03-17 13:35:23,688.688 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.proj.bias loaded from bert.encoder.blocks.7.attn.proj.bias of shape (768,)
2022-03-17 13:35:23,688.688 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.proj.weight loaded from bert.encoder.blocks.7.attn.proj.weight of shape (768, 768)
2022-03-17 13:35:23,688.688 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.qkv.bias loaded
from bert.encoder.blocks.7.attn.qkv.bias of shape (2304,) 2022-03-17 13:35:23,688.688 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.qkv.weight loaded from bert.encoder.blocks.7.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:35:23,688.688 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc1.bias loaded from bert.encoder.blocks.7.mlp.fc1.bias of shape (3072,) 2022-03-17 13:35:23,688.688 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc1.weight loaded from bert.encoder.blocks.7.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:35:23,689.689 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc2.bias loaded from bert.encoder.blocks.7.mlp.fc2.bias of shape (768,) 2022-03-17 13:35:23,689.689 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc2.weight loaded from bert.encoder.blocks.7.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:35:23,689.689 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm1.bias loaded from bert.encoder.blocks.7.norm1.bias of shape (768,) 2022-03-17 13:35:23,689.689 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm1.weight loaded from bert.encoder.blocks.7.norm1.weight of shape (768,) 2022-03-17 13:35:23,689.689 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm2.bias loaded from bert.encoder.blocks.7.norm2.bias of shape (768,) 2022-03-17 13:35:23,689.689 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm2.weight loaded from bert.encoder.blocks.7.norm2.weight of shape (768,) 2022-03-17 13:35:23,689.689 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.proj.bias loaded from bert.encoder.blocks.8.attn.proj.bias of shape (768,) 2022-03-17 13:35:23,689.689 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.proj.weight loaded from bert.encoder.blocks.8.attn.proj.weight of shape (768, 768) 2022-03-17 13:35:23,689.689 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.qkv.bias loaded from bert.encoder.blocks.8.attn.qkv.bias of shape (2304,) 2022-03-17 13:35:23,689.689 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.qkv.weight loaded from bert.encoder.blocks.8.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:35:23,689.689 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc1.bias loaded from bert.encoder.blocks.8.mlp.fc1.bias of shape (3072,) 2022-03-17 13:35:23,689.689 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc1.weight loaded from bert.encoder.blocks.8.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:35:23,689.689 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc2.bias loaded from bert.encoder.blocks.8.mlp.fc2.bias of shape (768,) 2022-03-17 13:35:23,689.689 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc2.weight loaded from bert.encoder.blocks.8.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:35:23,690.690 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm1.bias loaded from bert.encoder.blocks.8.norm1.bias of shape (768,) 2022-03-17 13:35:23,690.690 2829:checkpoint.py:99 
align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm1.weight loaded from bert.encoder.blocks.8.norm1.weight of shape (768,) 2022-03-17 13:35:23,690.690 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm2.bias loaded from bert.encoder.blocks.8.norm2.bias of shape (768,) 2022-03-17 13:35:23,690.690 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm2.weight loaded from bert.encoder.blocks.8.norm2.weight of shape (768,) 2022-03-17 13:35:23,690.690 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.proj.bias loaded from bert.encoder.blocks.9.attn.proj.bias of shape (768,) 2022-03-17 13:35:23,690.690 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.proj.weight loaded from bert.encoder.blocks.9.attn.proj.weight of shape (768, 768) 2022-03-17 13:35:23,690.690 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.qkv.bias loaded from bert.encoder.blocks.9.attn.qkv.bias of shape (2304,) 2022-03-17 13:35:23,690.690 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.qkv.weight loaded from bert.encoder.blocks.9.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:35:23,690.690 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc1.bias loaded from bert.encoder.blocks.9.mlp.fc1.bias of shape (3072,) 2022-03-17 13:35:23,690.690 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc1.weight loaded from bert.encoder.blocks.9.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:35:23,690.690 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc2.bias loaded from bert.encoder.blocks.9.mlp.fc2.bias of shape (768,) 2022-03-17 13:35:23,690.690 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc2.weight loaded from bert.encoder.blocks.9.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:35:23,690.690 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm1.bias loaded from bert.encoder.blocks.9.norm1.bias of shape (768,) 2022-03-17 13:35:23,690.690 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm1.weight loaded from bert.encoder.blocks.9.norm1.weight of shape (768,) 2022-03-17 13:35:23,691.691 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm2.bias loaded from bert.encoder.blocks.9.norm2.bias of shape (768,) 2022-03-17 13:35:23,691.691 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm2.weight loaded from bert.encoder.blocks.9.norm2.weight of shape (768,) 2022-03-17 13:35:23,691.691 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.proj.bias loaded from bert.encoder.tag_blocks.0.attn.proj.bias of shape (768,) 2022-03-17 13:35:23,691.691 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.proj.weight loaded from bert.encoder.tag_blocks.0.attn.proj.weight of shape (768, 768) 2022-03-17 13:35:23,691.691 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.qkv.bias loaded from bert.encoder.tag_blocks.0.attn.qkv.bias of shape (2304,) 2022-03-17 13:35:23,691.691 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.qkv.weight loaded from 
bert.encoder.tag_blocks.0.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:35:23,691.691 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc1.bias loaded from bert.encoder.tag_blocks.0.mlp.fc1.bias of shape (3072,) 2022-03-17 13:35:23,691.691 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc1.weight loaded from bert.encoder.tag_blocks.0.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:35:23,691.691 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc2.bias loaded from bert.encoder.tag_blocks.0.mlp.fc2.bias of shape (768,) 2022-03-17 13:35:23,691.691 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc2.weight loaded from bert.encoder.tag_blocks.0.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:35:23,691.691 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm1.bias loaded from bert.encoder.tag_blocks.0.norm1.bias of shape (768,) 2022-03-17 13:35:23,691.691 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm1.weight loaded from bert.encoder.tag_blocks.0.norm1.weight of shape (768,) 2022-03-17 13:35:23,691.691 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm2.bias loaded from bert.encoder.tag_blocks.0.norm2.bias of shape (768,) 2022-03-17 13:35:23,691.691 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm2.weight loaded from bert.encoder.tag_blocks.0.norm2.weight of shape (768,) 2022-03-17 13:35:23,692.692 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.proj.bias loaded from bert.encoder.tag_blocks.1.attn.proj.bias of shape (768,) 2022-03-17 13:35:23,692.692 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.proj.weight loaded from bert.encoder.tag_blocks.1.attn.proj.weight of shape (768, 768) 2022-03-17 13:35:23,692.692 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.qkv.bias loaded from bert.encoder.tag_blocks.1.attn.qkv.bias of shape (2304,) 2022-03-17 13:35:23,692.692 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.qkv.weight loaded from bert.encoder.tag_blocks.1.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:35:23,692.692 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc1.bias loaded from bert.encoder.tag_blocks.1.mlp.fc1.bias of shape (3072,) 2022-03-17 13:35:23,692.692 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc1.weight loaded from bert.encoder.tag_blocks.1.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:35:23,692.692 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc2.bias loaded from bert.encoder.tag_blocks.1.mlp.fc2.bias of shape (768,) 2022-03-17 13:35:23,692.692 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc2.weight loaded from bert.encoder.tag_blocks.1.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:35:23,692.692 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm1.bias loaded from bert.encoder.tag_blocks.1.norm1.bias of shape (768,) 2022-03-17 13:35:23,692.692 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.encoder.tag_blocks.1.norm1.weight loaded from bert.encoder.tag_blocks.1.norm1.weight of shape (768,) 2022-03-17 13:35:23,692.692 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm2.bias loaded from bert.encoder.tag_blocks.1.norm2.bias of shape (768,) 2022-03-17 13:35:23,692.692 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm2.weight loaded from bert.encoder.tag_blocks.1.norm2.weight of shape (768,) 2022-03-17 13:35:23,692.692 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.proj.bias loaded from bert.encoder.tag_blocks.2.attn.proj.bias of shape (768,) 2022-03-17 13:35:23,693.693 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.proj.weight loaded from bert.encoder.tag_blocks.2.attn.proj.weight of shape (768, 768) 2022-03-17 13:35:23,693.693 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.qkv.bias loaded from bert.encoder.tag_blocks.2.attn.qkv.bias of shape (2304,) 2022-03-17 13:35:23,693.693 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.qkv.weight loaded from bert.encoder.tag_blocks.2.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:35:23,693.693 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc1.bias loaded from bert.encoder.tag_blocks.2.mlp.fc1.bias of shape (3072,) 2022-03-17 13:35:23,693.693 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc1.weight loaded from bert.encoder.tag_blocks.2.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:35:23,693.693 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc2.bias loaded from bert.encoder.tag_blocks.2.mlp.fc2.bias of shape (768,) 2022-03-17 13:35:23,693.693 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc2.weight loaded from bert.encoder.tag_blocks.2.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:35:23,693.693 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm1.bias loaded from bert.encoder.tag_blocks.2.norm1.bias of shape (768,) 2022-03-17 13:35:23,693.693 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm1.weight loaded from bert.encoder.tag_blocks.2.norm1.weight of shape (768,) 2022-03-17 13:35:23,693.693 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm2.bias loaded from bert.encoder.tag_blocks.2.norm2.bias of shape (768,) 2022-03-17 13:35:23,693.693 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm2.weight loaded from bert.encoder.tag_blocks.2.norm2.weight of shape (768,) 2022-03-17 13:35:23,693.693 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.proj.bias loaded from bert.encoder.tag_blocks.3.attn.proj.bias of shape (768,) 2022-03-17 13:35:23,693.693 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.proj.weight loaded from bert.encoder.tag_blocks.3.attn.proj.weight of shape (768, 768) 2022-03-17 13:35:23,693.693 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.qkv.bias loaded from bert.encoder.tag_blocks.3.attn.qkv.bias of shape (2304,) 2022-03-17 13:35:23,694.694 2829:checkpoint.py:99 
align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.qkv.weight loaded from bert.encoder.tag_blocks.3.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:35:23,694.694 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc1.bias loaded from bert.encoder.tag_blocks.3.mlp.fc1.bias of shape (3072,) 2022-03-17 13:35:23,694.694 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc1.weight loaded from bert.encoder.tag_blocks.3.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:35:23,694.694 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc2.bias loaded from bert.encoder.tag_blocks.3.mlp.fc2.bias of shape (768,) 2022-03-17 13:35:23,694.694 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc2.weight loaded from bert.encoder.tag_blocks.3.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:35:23,694.694 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm1.bias loaded from bert.encoder.tag_blocks.3.norm1.bias of shape (768,) 2022-03-17 13:35:23,694.694 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm1.weight loaded from bert.encoder.tag_blocks.3.norm1.weight of shape (768,) 2022-03-17 13:35:23,694.694 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm2.bias loaded from bert.encoder.tag_blocks.3.norm2.bias of shape (768,) 2022-03-17 13:35:23,694.694 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm2.weight loaded from bert.encoder.tag_blocks.3.norm2.weight of shape (768,) 2022-03-17 13:35:23,694.694 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.LayerNorm.bias loaded from bert.extra_embeddings.LayerNorm.bias of shape (768,) 2022-03-17 13:35:23,694.694 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.LayerNorm.weight loaded from bert.extra_embeddings.LayerNorm.weight of shape (768,) 2022-03-17 13:35:23,694.694 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.position_embeddings.weight loaded from bert.extra_embeddings.position_embeddings.weight of shape (512, 768) 2022-03-17 13:35:23,694.694 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.token_type_embeddings.weight loaded from bert.extra_embeddings.token_type_embeddings.weight of shape (2, 768) 2022-03-17 13:35:23,694.694 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.word_embeddings.weight loaded from bert.extra_embeddings.word_embeddings.weight of shape (30522, 768) 2022-03-17 13:35:23,694.694 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.pooler.dense.bias loaded from bert.pooler.dense.bias of shape (768,) 2022-03-17 13:35:23,695.695 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.pooler.dense.weight loaded from bert.pooler.dense.weight of shape (768, 768) 2022-03-17 13:35:23,695.695 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.bias loaded from bert.tag_logit.predictions.bias of shape (30522,) 2022-03-17 13:35:23,695.695 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.decoder.weight loaded from bert.tag_logit.predictions.decoder.weight of shape (30522, 768) 2022-03-17 13:35:23,695.695 2829:checkpoint.py:99 
align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.LayerNorm.bias loaded from bert.tag_logit.predictions.transform.LayerNorm.bias of shape (768,)
2022-03-17 13:35:23,695.695 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.LayerNorm.weight loaded from bert.tag_logit.predictions.transform.LayerNorm.weight of shape (768,)
2022-03-17 13:35:23,695.695 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.dense.bias loaded from bert.tag_logit.predictions.transform.dense.bias of shape (768,)
2022-03-17 13:35:23,695.695 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.dense.weight loaded from bert.tag_logit.predictions.transform.dense.weight of shape (768, 768)
2022-03-17 13:35:23,695.695 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.bias loaded from cls.predictions.bias of shape (30522,)
2022-03-17 13:35:23,695.695 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.decoder.weight loaded from cls.predictions.decoder.weight of shape (30522, 768)
2022-03-17 13:35:23,695.695 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.LayerNorm.bias loaded from cls.predictions.transform.LayerNorm.bias of shape (768,)
2022-03-17 13:35:23,695.695 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.LayerNorm.weight loaded from cls.predictions.transform.LayerNorm.weight of shape (768,)
2022-03-17 13:35:23,695.695 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.dense.bias loaded from cls.predictions.transform.dense.bias of shape (768,)
2022-03-17 13:35:23,695.695 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.dense.weight loaded from cls.predictions.transform.dense.weight of shape (768, 768)
2022-03-17 13:35:23,696.696 2829:checkpoint.py:104 align_and_update_state_dicts(): target model param = 288; name matched = 288; loaded = 288
2022-03-17 13:35:23,696.696 2829:checkpoint.py:107 align_and_update_state_dicts(): from loaded; ignore = []
2022-03-17 13:35:23,699.699 2829:torch_common.py:1069 load_model_state_ignore_mismatch(): unique keys in init dict = ['module.cls.predictions.decoder.weight']; total = 1
2022-03-17 13:35:23,857.857 2829:torch_common.py:1074 load_model_state_ignore_mismatch(): unique key (not initialized) in current model = ['module.cls.predictions.decoder.weight']
2022-03-17 13:35:24,162.162 2829:qd_common.py:3452 print_frame_info(): func name = __init__; self = ; data = TaxCocoCaption; split = test; add_key = False; backend = cv; hold_buffer = 0; save_original = False
2022-03-17 13:35:24,188.188 2829:uni_pipeline.py:509 get_data_loader(): sampler =
2022-03-17 13:35:24,188.188 2829:uni_pipeline.py:1000 predict(): writing output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0005000.pt.TaxCocoCaption.test.crop384.predict.tsv_0_32.tsv
2022-03-17 13:35:24,262.262 2829:uni_pipeline.py:907 predict_iter(): DatasetPlusTransform(dataset=, transform=Compose(
    Compose(
        ImageTransform2Dict(image_transform=Compose(
            ToPILImage()
            Resize(size=384, interpolation=PIL.Image.BICUBIC)
            CenterCrop(size=(384, 384))
            ToTensor()
            Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
        ))
    )
    LoadLabel(data=TaxCocoCaption, split=test, version=vinvl)
    TransCaptionTensorizer(tensorizer=, pad_to_max=True, pad_image_to_max=True)
))
uni_pipeline.py:908: 0%| | 0/10 [00:00
    locals()[function_name](**kwargs)
  File "src/qd/pipeline.py", line 637, in pipeline_train_eval_multi
    pip.monitor_train()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1224, in monitor_train
    need_wait_models = self.pred_eval_intermediate_models()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1286, in pred_eval_intermediate_models
    pred = self.ensure_predict(model_file=model_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 886, in ensure_predict
    self.predict(model_file, predict_result_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 989, in predict
    model = self.get_model(is_train=False)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model
    model = self.get_raw_model(is_train)
  File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model
    model = TaggerEncDecSplitForImageCaptioning(config=config) # init from scratch
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__
    self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__
    self.encoder = TIMMVitSplitEncoder(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__
    logging.info(config)
Unable to print the message and arguments - possible formatting error. Use the traceback above to help find the error.
--- Logging error ---
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 1025, in emit
    msg = self.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 869, in format
    return fmt.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 608, in format
    record.message = record.getMessage()
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 367, in getMessage
    msg = str(self.msg)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 236, in __repr__
    return str(self.to_json_string())
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 245, in to_json_string
    return json.dumps(self.to_dict(), indent=2, sort_keys=True) + "\n"
  File "/opt/conda/lib/python3.7/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 201, in encode
    chunks = list(chunks)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 431, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "/opt/conda/lib/python3.7/json/encoder.py", line 438, in _iterencode
    o = _default(o)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type BertTokenizer is not JSON serializable
Call stack:
  File "src/qd/pipeline.py", line 1368, in <module>
    locals()[function_name](**kwargs)
  File "src/qd/pipeline.py", line 637, in pipeline_train_eval_multi
    pip.monitor_train()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1224, in monitor_train
    need_wait_models = self.pred_eval_intermediate_models()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1286, in pred_eval_intermediate_models
    pred = self.ensure_predict(model_file=model_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 886, in ensure_predict
    self.predict(model_file, predict_result_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 989, in predict
    model = self.get_model(is_train=False)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model
    model = self.get_raw_model(is_train)
  File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model
    model = TaggerEncDecSplitForImageCaptioning(config=config) # init from scratch
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__
    self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__
    self.encoder = TIMMVitSplitEncoder(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__
    logging.info(config)
Unable to print the message and arguments - possible formatting error. Use the traceback above to help find the error.
2022-03-17 13:37:44,284.284 2829:modeling_bert.py:529 __init__(): TIMM Split image encoder load from pre-trained: True
2022-03-17 13:37:45,355.355 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-17 13:37:46,911.911 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-17 13:37:47,559.559 2829:modeling_bert.py:2677 __init__(): BertImgModel Image Dimension: 2054
2022-03-17 13:37:47,707.707 2829:modeling_bert.py:2688 __init__(): BertImgModel Image Dimension: 2054
2022-03-17 13:37:48,499.499 2829:tagger_caption_uni_pipeline_expanding.py:1130 get_image_encoder_model(): VIT image encoder loaded from pre-trained weight! Note that this might be replaced by pre-trained checkpoint later!
2022-03-17 13:37:48,499.499 2829:tagger_caption_uni_pipeline_expanding.py:1134 get_image_encoder_model(): Non-Patch Selection Mode.
2022-03-17 13:37:49,635.635 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py 2022-03-17 13:37:50,179.179 2829:checkpoint.py:240 load(): Loading checkpoint from output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0010000.pt 2022-03-17 13:37:58,151.151 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.cls_token loaded from image_encoder.module.cls_token of shape (1, 1, 768) 2022-03-17 13:37:58,151.151 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.head.bias loaded from image_encoder.module.head.bias of shape (1000,) 2022-03-17 13:37:58,152.152 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.head.weight loaded from image_encoder.module.head.weight of shape (1000, 768) 2022-03-17 13:37:58,152.152 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.patch_embed.proj.bias loaded from image_encoder.module.patch_embed.proj.bias of shape (768,) 2022-03-17 13:37:58,152.152 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.patch_embed.proj.weight loaded from image_encoder.module.patch_embed.proj.weight of shape (768, 3, 16, 16) 2022-03-17 13:37:58,152.152 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.pos_embed loaded from image_encoder.module.pos_embed of shape (1, 577, 768) 2022-03-17 13:37:58,152.152 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.caption_pooler.dense.bias loaded from bert.caption_pooler.dense.bias of shape (768,) 2022-03-17 13:37:58,152.152 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.caption_pooler.dense.weight loaded from bert.caption_pooler.dense.weight of shape (768, 768) 2022-03-17 13:37:58,152.152 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.LayerNorm.bias loaded from bert.decoder.layer.0.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:37:58,152.152 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.LayerNorm.weight loaded from bert.decoder.layer.0.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:37:58,152.152 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.dense.bias loaded from bert.decoder.layer.0.attention.output.dense.bias of shape (768,) 2022-03-17 13:37:58,152.152 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.dense.weight loaded from bert.decoder.layer.0.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:37:58,152.152 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.key.bias loaded from bert.decoder.layer.0.attention.self.key.bias of shape (768,) 2022-03-17 13:37:58,152.152 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.key.weight loaded from bert.decoder.layer.0.attention.self.key.weight of shape (768, 768) 2022-03-17 13:37:58,153.153 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.query.bias loaded from bert.decoder.layer.0.attention.self.query.bias of shape (768,) 2022-03-17 13:37:58,153.153 
2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.query.weight loaded from bert.decoder.layer.0.attention.self.query.weight of shape (768, 768) 2022-03-17 13:37:58,153.153 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.value.bias loaded from bert.decoder.layer.0.attention.self.value.bias of shape (768,) 2022-03-17 13:37:58,153.153 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.value.weight loaded from bert.decoder.layer.0.attention.self.value.weight of shape (768, 768) 2022-03-17 13:37:58,153.153 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.intermediate.dense.bias loaded from bert.decoder.layer.0.intermediate.dense.bias of shape (3072,) 2022-03-17 13:37:58,153.153 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.intermediate.dense.weight loaded from bert.decoder.layer.0.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:37:58,153.153 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.LayerNorm.bias loaded from bert.decoder.layer.0.output.LayerNorm.bias of shape (768,) 2022-03-17 13:37:58,153.153 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.LayerNorm.weight loaded from bert.decoder.layer.0.output.LayerNorm.weight of shape (768,) 2022-03-17 13:37:58,153.153 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.dense.bias loaded from bert.decoder.layer.0.output.dense.bias of shape (768,) 2022-03-17 13:37:58,153.153 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.dense.weight loaded from bert.decoder.layer.0.output.dense.weight of shape (768, 3072) 2022-03-17 13:37:58,153.153 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.LayerNorm.bias loaded from bert.decoder.layer.1.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:37:58,153.153 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.LayerNorm.weight loaded from bert.decoder.layer.1.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:37:58,153.153 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.dense.bias loaded from bert.decoder.layer.1.attention.output.dense.bias of shape (768,) 2022-03-17 13:37:58,154.154 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.dense.weight loaded from bert.decoder.layer.1.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:37:58,154.154 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.key.bias loaded from bert.decoder.layer.1.attention.self.key.bias of shape (768,) 2022-03-17 13:37:58,154.154 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.key.weight loaded from bert.decoder.layer.1.attention.self.key.weight of shape (768, 768) 2022-03-17 13:37:58,154.154 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.query.bias loaded from bert.decoder.layer.1.attention.self.query.bias of shape (768,) 2022-03-17 13:37:58,154.154 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.query.weight loaded from 
bert.decoder.layer.1.attention.self.query.weight of shape (768, 768) 2022-03-17 13:37:58,154.154 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.value.bias loaded from bert.decoder.layer.1.attention.self.value.bias of shape (768,) 2022-03-17 13:37:58,154.154 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.value.weight loaded from bert.decoder.layer.1.attention.self.value.weight of shape (768, 768) 2022-03-17 13:37:58,154.154 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.intermediate.dense.bias loaded from bert.decoder.layer.1.intermediate.dense.bias of shape (3072,) 2022-03-17 13:37:58,154.154 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.intermediate.dense.weight loaded from bert.decoder.layer.1.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:37:58,154.154 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.LayerNorm.bias loaded from bert.decoder.layer.1.output.LayerNorm.bias of shape (768,) 2022-03-17 13:37:58,154.154 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.LayerNorm.weight loaded from bert.decoder.layer.1.output.LayerNorm.weight of shape (768,) 2022-03-17 13:37:58,154.154 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.dense.bias loaded from bert.decoder.layer.1.output.dense.bias of shape (768,) 2022-03-17 13:37:58,154.154 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.dense.weight loaded from bert.decoder.layer.1.output.dense.weight of shape (768, 3072) 2022-03-17 13:37:58,154.154 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.LayerNorm.bias loaded from bert.decoder.layer.2.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:37:58,155.155 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.LayerNorm.weight loaded from bert.decoder.layer.2.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:37:58,155.155 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.dense.bias loaded from bert.decoder.layer.2.attention.output.dense.bias of shape (768,) 2022-03-17 13:37:58,155.155 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.dense.weight loaded from bert.decoder.layer.2.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:37:58,155.155 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.key.bias loaded from bert.decoder.layer.2.attention.self.key.bias of shape (768,) 2022-03-17 13:37:58,155.155 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.key.weight loaded from bert.decoder.layer.2.attention.self.key.weight of shape (768, 768) 2022-03-17 13:37:58,155.155 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.query.bias loaded from bert.decoder.layer.2.attention.self.query.bias of shape (768,) 2022-03-17 13:37:58,155.155 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.query.weight loaded from bert.decoder.layer.2.attention.self.query.weight of shape (768, 768) 2022-03-17 13:37:58,155.155 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.decoder.layer.2.attention.self.value.bias loaded from bert.decoder.layer.2.attention.self.value.bias of shape (768,) 2022-03-17 13:37:58,155.155 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.value.weight loaded from bert.decoder.layer.2.attention.self.value.weight of shape (768, 768) 2022-03-17 13:37:58,155.155 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.intermediate.dense.bias loaded from bert.decoder.layer.2.intermediate.dense.bias of shape (3072,) 2022-03-17 13:37:58,155.155 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.intermediate.dense.weight loaded from bert.decoder.layer.2.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:37:58,155.155 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.LayerNorm.bias loaded from bert.decoder.layer.2.output.LayerNorm.bias of shape (768,) 2022-03-17 13:37:58,155.155 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.LayerNorm.weight loaded from bert.decoder.layer.2.output.LayerNorm.weight of shape (768,) 2022-03-17 13:37:58,156.156 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.dense.bias loaded from bert.decoder.layer.2.output.dense.bias of shape (768,) 2022-03-17 13:37:58,156.156 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.dense.weight loaded from bert.decoder.layer.2.output.dense.weight of shape (768, 3072) 2022-03-17 13:37:58,156.156 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.LayerNorm.bias loaded from bert.decoder.layer.3.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:37:58,156.156 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.LayerNorm.weight loaded from bert.decoder.layer.3.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:37:58,156.156 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.dense.bias loaded from bert.decoder.layer.3.attention.output.dense.bias of shape (768,) 2022-03-17 13:37:58,156.156 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.dense.weight loaded from bert.decoder.layer.3.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:37:58,156.156 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.key.bias loaded from bert.decoder.layer.3.attention.self.key.bias of shape (768,) 2022-03-17 13:37:58,156.156 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.key.weight loaded from bert.decoder.layer.3.attention.self.key.weight of shape (768, 768) 2022-03-17 13:37:58,156.156 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.query.bias loaded from bert.decoder.layer.3.attention.self.query.bias of shape (768,) 2022-03-17 13:37:58,156.156 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.query.weight loaded from bert.decoder.layer.3.attention.self.query.weight of shape (768, 768) 2022-03-17 13:37:58,157.157 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.value.bias loaded from bert.decoder.layer.3.attention.self.value.bias of shape (768,) 2022-03-17 
13:37:58,157.157 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.value.weight loaded from bert.decoder.layer.3.attention.self.value.weight of shape (768, 768) 2022-03-17 13:37:58,157.157 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.intermediate.dense.bias loaded from bert.decoder.layer.3.intermediate.dense.bias of shape (3072,) 2022-03-17 13:37:58,157.157 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.intermediate.dense.weight loaded from bert.decoder.layer.3.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:37:58,157.157 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.LayerNorm.bias loaded from bert.decoder.layer.3.output.LayerNorm.bias of shape (768,) 2022-03-17 13:37:58,157.157 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.LayerNorm.weight loaded from bert.decoder.layer.3.output.LayerNorm.weight of shape (768,) 2022-03-17 13:37:58,157.157 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.dense.bias loaded from bert.decoder.layer.3.output.dense.bias of shape (768,) 2022-03-17 13:37:58,157.157 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.dense.weight loaded from bert.decoder.layer.3.output.dense.weight of shape (768, 3072) 2022-03-17 13:37:58,157.157 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.LayerNorm.bias loaded from bert.embeddings.LayerNorm.bias of shape (768,) 2022-03-17 13:37:58,157.157 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.LayerNorm.weight loaded from bert.embeddings.LayerNorm.weight of shape (768,) 2022-03-17 13:37:58,157.157 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.position_embeddings.weight loaded from bert.embeddings.position_embeddings.weight of shape (512, 768) 2022-03-17 13:37:58,157.157 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.token_type_embeddings.weight loaded from bert.embeddings.token_type_embeddings.weight of shape (2, 768) 2022-03-17 13:37:58,158.158 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.word_embeddings.weight loaded from bert.embeddings.word_embeddings.weight of shape (30522, 768) 2022-03-17 13:37:58,158.158 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.proj.bias loaded from bert.encoder.blocks.0.attn.proj.bias of shape (768,) 2022-03-17 13:37:58,158.158 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.proj.weight loaded from bert.encoder.blocks.0.attn.proj.weight of shape (768, 768) 2022-03-17 13:37:58,158.158 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.qkv.bias loaded from bert.encoder.blocks.0.attn.qkv.bias of shape (2304,) 2022-03-17 13:37:58,158.158 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.qkv.weight loaded from bert.encoder.blocks.0.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:37:58,158.158 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc1.bias loaded from bert.encoder.blocks.0.mlp.fc1.bias of shape (3072,) 2022-03-17 13:37:58,158.158 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc1.weight loaded from 
bert.encoder.blocks.0.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:37:58,158.158 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc2.bias loaded from bert.encoder.blocks.0.mlp.fc2.bias of shape (768,) 2022-03-17 13:37:58,158.158 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc2.weight loaded from bert.encoder.blocks.0.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:37:58,158.158 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm1.bias loaded from bert.encoder.blocks.0.norm1.bias of shape (768,) 2022-03-17 13:37:58,158.158 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm1.weight loaded from bert.encoder.blocks.0.norm1.weight of shape (768,) 2022-03-17 13:37:58,158.158 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm2.bias loaded from bert.encoder.blocks.0.norm2.bias of shape (768,) 2022-03-17 13:37:58,158.158 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm2.weight loaded from bert.encoder.blocks.0.norm2.weight of shape (768,) 2022-03-17 13:37:58,158.158 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.proj.bias loaded from bert.encoder.blocks.1.attn.proj.bias of shape (768,) 2022-03-17 13:37:58,159.159 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.proj.weight loaded from bert.encoder.blocks.1.attn.proj.weight of shape (768, 768) 2022-03-17 13:37:58,159.159 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.qkv.bias loaded from bert.encoder.blocks.1.attn.qkv.bias of shape (2304,) 2022-03-17 13:37:58,159.159 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.qkv.weight loaded from bert.encoder.blocks.1.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:37:58,159.159 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc1.bias loaded from bert.encoder.blocks.1.mlp.fc1.bias of shape (3072,) 2022-03-17 13:37:58,159.159 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc1.weight loaded from bert.encoder.blocks.1.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:37:58,159.159 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc2.bias loaded from bert.encoder.blocks.1.mlp.fc2.bias of shape (768,) 2022-03-17 13:37:58,159.159 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc2.weight loaded from bert.encoder.blocks.1.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:37:58,159.159 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm1.bias loaded from bert.encoder.blocks.1.norm1.bias of shape (768,) 2022-03-17 13:37:58,159.159 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm1.weight loaded from bert.encoder.blocks.1.norm1.weight of shape (768,) 2022-03-17 13:37:58,159.159 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm2.bias loaded from bert.encoder.blocks.1.norm2.bias of shape (768,) 2022-03-17 13:37:58,159.159 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm2.weight loaded from bert.encoder.blocks.1.norm2.weight of shape (768,) 2022-03-17 13:37:58,159.159 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.encoder.blocks.10.attn.proj.bias loaded from bert.encoder.blocks.10.attn.proj.bias of shape (768,) 2022-03-17 13:37:58,159.159 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.proj.weight loaded from bert.encoder.blocks.10.attn.proj.weight of shape (768, 768) 2022-03-17 13:37:58,160.160 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.qkv.bias loaded from bert.encoder.blocks.10.attn.qkv.bias of shape (2304,) 2022-03-17 13:37:58,160.160 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.qkv.weight loaded from bert.encoder.blocks.10.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:37:58,160.160 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc1.bias loaded from bert.encoder.blocks.10.mlp.fc1.bias of shape (3072,) 2022-03-17 13:37:58,160.160 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc1.weight loaded from bert.encoder.blocks.10.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:37:58,160.160 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc2.bias loaded from bert.encoder.blocks.10.mlp.fc2.bias of shape (768,) 2022-03-17 13:37:58,160.160 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc2.weight loaded from bert.encoder.blocks.10.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:37:58,160.160 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm1.bias loaded from bert.encoder.blocks.10.norm1.bias of shape (768,) 2022-03-17 13:37:58,160.160 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm1.weight loaded from bert.encoder.blocks.10.norm1.weight of shape (768,) 2022-03-17 13:37:58,160.160 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm2.bias loaded from bert.encoder.blocks.10.norm2.bias of shape (768,) 2022-03-17 13:37:58,160.160 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm2.weight loaded from bert.encoder.blocks.10.norm2.weight of shape (768,) 2022-03-17 13:37:58,160.160 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.proj.bias loaded from bert.encoder.blocks.11.attn.proj.bias of shape (768,) 2022-03-17 13:37:58,160.160 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.proj.weight loaded from bert.encoder.blocks.11.attn.proj.weight of shape (768, 768) 2022-03-17 13:37:58,160.160 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.qkv.bias loaded from bert.encoder.blocks.11.attn.qkv.bias of shape (2304,) 2022-03-17 13:37:58,161.161 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.qkv.weight loaded from bert.encoder.blocks.11.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:37:58,161.161 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc1.bias loaded from bert.encoder.blocks.11.mlp.fc1.bias of shape (3072,) 2022-03-17 13:37:58,161.161 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc1.weight loaded from bert.encoder.blocks.11.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:37:58,161.161 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc2.bias loaded from 
bert.encoder.blocks.11.mlp.fc2.bias of shape (768,) 2022-03-17 13:37:58,161.161 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc2.weight loaded from bert.encoder.blocks.11.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:37:58,161.161 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm1.bias loaded from bert.encoder.blocks.11.norm1.bias of shape (768,) 2022-03-17 13:37:58,161.161 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm1.weight loaded from bert.encoder.blocks.11.norm1.weight of shape (768,) 2022-03-17 13:37:58,161.161 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm2.bias loaded from bert.encoder.blocks.11.norm2.bias of shape (768,) 2022-03-17 13:37:58,161.161 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm2.weight loaded from bert.encoder.blocks.11.norm2.weight of shape (768,) 2022-03-17 13:37:58,161.161 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.proj.bias loaded from bert.encoder.blocks.2.attn.proj.bias of shape (768,) 2022-03-17 13:37:58,161.161 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.proj.weight loaded from bert.encoder.blocks.2.attn.proj.weight of shape (768, 768) 2022-03-17 13:37:58,161.161 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.qkv.bias loaded from bert.encoder.blocks.2.attn.qkv.bias of shape (2304,) 2022-03-17 13:37:58,161.161 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.qkv.weight loaded from bert.encoder.blocks.2.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:37:58,162.162 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc1.bias loaded from bert.encoder.blocks.2.mlp.fc1.bias of shape (3072,) 2022-03-17 13:37:58,162.162 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc1.weight loaded from bert.encoder.blocks.2.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:37:58,162.162 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc2.bias loaded from bert.encoder.blocks.2.mlp.fc2.bias of shape (768,) 2022-03-17 13:37:58,162.162 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc2.weight loaded from bert.encoder.blocks.2.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:37:58,162.162 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm1.bias loaded from bert.encoder.blocks.2.norm1.bias of shape (768,) 2022-03-17 13:37:58,162.162 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm1.weight loaded from bert.encoder.blocks.2.norm1.weight of shape (768,) 2022-03-17 13:37:58,162.162 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm2.bias loaded from bert.encoder.blocks.2.norm2.bias of shape (768,) 2022-03-17 13:37:58,162.162 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm2.weight loaded from bert.encoder.blocks.2.norm2.weight of shape (768,) 2022-03-17 13:37:58,162.162 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.proj.bias loaded from bert.encoder.blocks.3.attn.proj.bias of shape (768,) 2022-03-17 13:37:58,162.162 2829:checkpoint.py:99 
align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.proj.weight loaded from bert.encoder.blocks.3.attn.proj.weight of shape (768, 768) 2022-03-17 13:37:58,162.162 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.qkv.bias loaded from bert.encoder.blocks.3.attn.qkv.bias of shape (2304,) 2022-03-17 13:37:58,162.162 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.qkv.weight loaded from bert.encoder.blocks.3.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:37:58,162.162 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc1.bias loaded from bert.encoder.blocks.3.mlp.fc1.bias of shape (3072,) 2022-03-17 13:37:58,162.162 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc1.weight loaded from bert.encoder.blocks.3.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:37:58,163.163 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc2.bias loaded from bert.encoder.blocks.3.mlp.fc2.bias of shape (768,) 2022-03-17 13:37:58,163.163 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc2.weight loaded from bert.encoder.blocks.3.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:37:58,163.163 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm1.bias loaded from bert.encoder.blocks.3.norm1.bias of shape (768,) 2022-03-17 13:37:58,163.163 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm1.weight loaded from bert.encoder.blocks.3.norm1.weight of shape (768,) 2022-03-17 13:37:58,163.163 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm2.bias loaded from bert.encoder.blocks.3.norm2.bias of shape (768,) 2022-03-17 13:37:58,163.163 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm2.weight loaded from bert.encoder.blocks.3.norm2.weight of shape (768,) 2022-03-17 13:37:58,163.163 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.attn.proj.bias loaded from bert.encoder.blocks.4.attn.proj.bias of shape (768,) 2022-03-17 13:37:58,163.163 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.attn.proj.weight loaded from bert.encoder.blocks.4.attn.proj.weight of shape (768, 768) 2022-03-17 13:37:58,163.163 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.attn.qkv.bias loaded from bert.encoder.blocks.4.attn.qkv.bias of shape (2304,) 2022-03-17 13:37:58,163.163 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.attn.qkv.weight loaded from bert.encoder.blocks.4.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:37:58,163.163 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.mlp.fc1.bias loaded from bert.encoder.blocks.4.mlp.fc1.bias of shape (3072,) 2022-03-17 13:37:58,163.163 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.mlp.fc1.weight loaded from bert.encoder.blocks.4.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:37:58,163.163 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.mlp.fc2.bias loaded from bert.encoder.blocks.4.mlp.fc2.bias of shape (768,) 2022-03-17 13:37:58,163.163 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.mlp.fc2.weight loaded from 
bert.encoder.blocks.4.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:37:58,164.164 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.norm1.bias loaded from bert.encoder.blocks.4.norm1.bias of shape (768,) 2022-03-17 13:37:58,164.164 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.norm1.weight loaded from bert.encoder.blocks.4.norm1.weight of shape (768,) 2022-03-17 13:37:58,164.164 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.norm2.bias loaded from bert.encoder.blocks.4.norm2.bias of shape (768,) 2022-03-17 13:37:58,164.164 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.norm2.weight loaded from bert.encoder.blocks.4.norm2.weight of shape (768,) 2022-03-17 13:37:58,164.164 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.attn.proj.bias loaded from bert.encoder.blocks.5.attn.proj.bias of shape (768,) 2022-03-17 13:37:58,164.164 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.attn.proj.weight loaded from bert.encoder.blocks.5.attn.proj.weight of shape (768, 768) 2022-03-17 13:37:58,164.164 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.attn.qkv.bias loaded from bert.encoder.blocks.5.attn.qkv.bias of shape (2304,) 2022-03-17 13:37:58,164.164 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.attn.qkv.weight loaded from bert.encoder.blocks.5.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:37:58,164.164 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.mlp.fc1.bias loaded from bert.encoder.blocks.5.mlp.fc1.bias of shape (3072,) 2022-03-17 13:37:58,164.164 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.mlp.fc1.weight loaded from bert.encoder.blocks.5.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:37:58,164.164 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.mlp.fc2.bias loaded from bert.encoder.blocks.5.mlp.fc2.bias of shape (768,) 2022-03-17 13:37:58,164.164 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.mlp.fc2.weight loaded from bert.encoder.blocks.5.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:37:58,164.164 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm1.bias loaded from bert.encoder.blocks.5.norm1.bias of shape (768,) 2022-03-17 13:37:58,165.165 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm1.weight loaded from bert.encoder.blocks.5.norm1.weight of shape (768,) 2022-03-17 13:37:58,165.165 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm2.bias loaded from bert.encoder.blocks.5.norm2.bias of shape (768,) 2022-03-17 13:37:58,165.165 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm2.weight loaded from bert.encoder.blocks.5.norm2.weight of shape (768,) 2022-03-17 13:37:58,165.165 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.attn.proj.bias loaded from bert.encoder.blocks.6.attn.proj.bias of shape (768,) 2022-03-17 13:37:58,165.165 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.attn.proj.weight loaded from bert.encoder.blocks.6.attn.proj.weight of shape (768, 768) 2022-03-17 13:37:58,165.165 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.encoder.blocks.6.attn.qkv.bias loaded from bert.encoder.blocks.6.attn.qkv.bias of shape (2304,) 2022-03-17 13:37:58,165.165 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.attn.qkv.weight loaded from bert.encoder.blocks.6.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:37:58,165.165 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc1.bias loaded from bert.encoder.blocks.6.mlp.fc1.bias of shape (3072,) 2022-03-17 13:37:58,165.165 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc1.weight loaded from bert.encoder.blocks.6.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:37:58,165.165 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc2.bias loaded from bert.encoder.blocks.6.mlp.fc2.bias of shape (768,) 2022-03-17 13:37:58,165.165 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc2.weight loaded from bert.encoder.blocks.6.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:37:58,165.165 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm1.bias loaded from bert.encoder.blocks.6.norm1.bias of shape (768,) 2022-03-17 13:37:58,165.165 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm1.weight loaded from bert.encoder.blocks.6.norm1.weight of shape (768,) 2022-03-17 13:37:58,166.166 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm2.bias loaded from bert.encoder.blocks.6.norm2.bias of shape (768,) 2022-03-17 13:37:58,166.166 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm2.weight loaded from bert.encoder.blocks.6.norm2.weight of shape (768,) 2022-03-17 13:37:58,166.166 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.proj.bias loaded from bert.encoder.blocks.7.attn.proj.bias of shape (768,) 2022-03-17 13:37:58,166.166 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.proj.weight loaded from bert.encoder.blocks.7.attn.proj.weight of shape (768, 768) 2022-03-17 13:37:58,166.166 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.qkv.bias loaded from bert.encoder.blocks.7.attn.qkv.bias of shape (2304,) 2022-03-17 13:37:58,166.166 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.qkv.weight loaded from bert.encoder.blocks.7.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:37:58,166.166 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc1.bias loaded from bert.encoder.blocks.7.mlp.fc1.bias of shape (3072,) 2022-03-17 13:37:58,166.166 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc1.weight loaded from bert.encoder.blocks.7.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:37:58,166.166 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc2.bias loaded from bert.encoder.blocks.7.mlp.fc2.bias of shape (768,) 2022-03-17 13:37:58,166.166 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc2.weight loaded from bert.encoder.blocks.7.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:37:58,166.166 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm1.bias loaded from bert.encoder.blocks.7.norm1.bias of shape (768,) 2022-03-17 
13:37:58,166.166 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm1.weight loaded from bert.encoder.blocks.7.norm1.weight of shape (768,) 2022-03-17 13:37:58,166.166 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm2.bias loaded from bert.encoder.blocks.7.norm2.bias of shape (768,) 2022-03-17 13:37:58,166.166 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm2.weight loaded from bert.encoder.blocks.7.norm2.weight of shape (768,) 2022-03-17 13:37:58,167.167 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.proj.bias loaded from bert.encoder.blocks.8.attn.proj.bias of shape (768,) 2022-03-17 13:37:58,167.167 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.proj.weight loaded from bert.encoder.blocks.8.attn.proj.weight of shape (768, 768) 2022-03-17 13:37:58,167.167 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.qkv.bias loaded from bert.encoder.blocks.8.attn.qkv.bias of shape (2304,) 2022-03-17 13:37:58,167.167 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.qkv.weight loaded from bert.encoder.blocks.8.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:37:58,167.167 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc1.bias loaded from bert.encoder.blocks.8.mlp.fc1.bias of shape (3072,) 2022-03-17 13:37:58,167.167 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc1.weight loaded from bert.encoder.blocks.8.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:37:58,167.167 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc2.bias loaded from bert.encoder.blocks.8.mlp.fc2.bias of shape (768,) 2022-03-17 13:37:58,167.167 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc2.weight loaded from bert.encoder.blocks.8.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:37:58,167.167 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm1.bias loaded from bert.encoder.blocks.8.norm1.bias of shape (768,) 2022-03-17 13:37:58,167.167 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm1.weight loaded from bert.encoder.blocks.8.norm1.weight of shape (768,) 2022-03-17 13:37:58,167.167 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm2.bias loaded from bert.encoder.blocks.8.norm2.bias of shape (768,) 2022-03-17 13:37:58,167.167 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm2.weight loaded from bert.encoder.blocks.8.norm2.weight of shape (768,) 2022-03-17 13:37:58,167.167 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.proj.bias loaded from bert.encoder.blocks.9.attn.proj.bias of shape (768,) 2022-03-17 13:37:58,167.167 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.proj.weight loaded from bert.encoder.blocks.9.attn.proj.weight of shape (768, 768) 2022-03-17 13:37:58,168.168 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.qkv.bias loaded from bert.encoder.blocks.9.attn.qkv.bias of shape (2304,) 2022-03-17 13:37:58,168.168 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.qkv.weight loaded from 
bert.encoder.blocks.9.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:37:58,168.168 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc1.bias loaded from bert.encoder.blocks.9.mlp.fc1.bias of shape (3072,) 2022-03-17 13:37:58,168.168 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc1.weight loaded from bert.encoder.blocks.9.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:37:58,168.168 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc2.bias loaded from bert.encoder.blocks.9.mlp.fc2.bias of shape (768,) 2022-03-17 13:37:58,168.168 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc2.weight loaded from bert.encoder.blocks.9.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:37:58,168.168 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm1.bias loaded from bert.encoder.blocks.9.norm1.bias of shape (768,) 2022-03-17 13:37:58,168.168 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm1.weight loaded from bert.encoder.blocks.9.norm1.weight of shape (768,) 2022-03-17 13:37:58,168.168 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm2.bias loaded from bert.encoder.blocks.9.norm2.bias of shape (768,) 2022-03-17 13:37:58,168.168 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm2.weight loaded from bert.encoder.blocks.9.norm2.weight of shape (768,) 2022-03-17 13:37:58,168.168 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.proj.bias loaded from bert.encoder.tag_blocks.0.attn.proj.bias of shape (768,) 2022-03-17 13:37:58,168.168 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.proj.weight loaded from bert.encoder.tag_blocks.0.attn.proj.weight of shape (768, 768) 2022-03-17 13:37:58,168.168 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.qkv.bias loaded from bert.encoder.tag_blocks.0.attn.qkv.bias of shape (2304,) 2022-03-17 13:37:58,169.169 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.qkv.weight loaded from bert.encoder.tag_blocks.0.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:37:58,169.169 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc1.bias loaded from bert.encoder.tag_blocks.0.mlp.fc1.bias of shape (3072,) 2022-03-17 13:37:58,169.169 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc1.weight loaded from bert.encoder.tag_blocks.0.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:37:58,169.169 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc2.bias loaded from bert.encoder.tag_blocks.0.mlp.fc2.bias of shape (768,) 2022-03-17 13:37:58,169.169 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc2.weight loaded from bert.encoder.tag_blocks.0.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:37:58,169.169 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm1.bias loaded from bert.encoder.tag_blocks.0.norm1.bias of shape (768,) 2022-03-17 13:37:58,169.169 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm1.weight loaded from bert.encoder.tag_blocks.0.norm1.weight of shape 
(768,) 2022-03-17 13:37:58,169.169 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm2.bias loaded from bert.encoder.tag_blocks.0.norm2.bias of shape (768,) 2022-03-17 13:37:58,169.169 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm2.weight loaded from bert.encoder.tag_blocks.0.norm2.weight of shape (768,) 2022-03-17 13:37:58,169.169 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.proj.bias loaded from bert.encoder.tag_blocks.1.attn.proj.bias of shape (768,) 2022-03-17 13:37:58,169.169 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.proj.weight loaded from bert.encoder.tag_blocks.1.attn.proj.weight of shape (768, 768) 2022-03-17 13:37:58,169.169 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.qkv.bias loaded from bert.encoder.tag_blocks.1.attn.qkv.bias of shape (2304,) 2022-03-17 13:37:58,169.169 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.qkv.weight loaded from bert.encoder.tag_blocks.1.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:37:58,169.169 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc1.bias loaded from bert.encoder.tag_blocks.1.mlp.fc1.bias of shape (3072,) 2022-03-17 13:37:58,170.170 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc1.weight loaded from bert.encoder.tag_blocks.1.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:37:58,170.170 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc2.bias loaded from bert.encoder.tag_blocks.1.mlp.fc2.bias of shape (768,) 2022-03-17 13:37:58,170.170 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc2.weight loaded from bert.encoder.tag_blocks.1.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:37:58,170.170 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm1.bias loaded from bert.encoder.tag_blocks.1.norm1.bias of shape (768,) 2022-03-17 13:37:58,170.170 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm1.weight loaded from bert.encoder.tag_blocks.1.norm1.weight of shape (768,) 2022-03-17 13:37:58,170.170 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm2.bias loaded from bert.encoder.tag_blocks.1.norm2.bias of shape (768,) 2022-03-17 13:37:58,170.170 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm2.weight loaded from bert.encoder.tag_blocks.1.norm2.weight of shape (768,) 2022-03-17 13:37:58,170.170 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.proj.bias loaded from bert.encoder.tag_blocks.2.attn.proj.bias of shape (768,) 2022-03-17 13:37:58,170.170 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.proj.weight loaded from bert.encoder.tag_blocks.2.attn.proj.weight of shape (768, 768) 2022-03-17 13:37:58,170.170 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.qkv.bias loaded from bert.encoder.tag_blocks.2.attn.qkv.bias of shape (2304,) 2022-03-17 13:37:58,170.170 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.qkv.weight loaded from 
bert.encoder.tag_blocks.2.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:37:58,170.170 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc1.bias loaded from bert.encoder.tag_blocks.2.mlp.fc1.bias of shape (3072,) 2022-03-17 13:37:58,170.170 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc1.weight loaded from bert.encoder.tag_blocks.2.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:37:58,171.171 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc2.bias loaded from bert.encoder.tag_blocks.2.mlp.fc2.bias of shape (768,) 2022-03-17 13:37:58,171.171 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc2.weight loaded from bert.encoder.tag_blocks.2.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:37:58,171.171 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm1.bias loaded from bert.encoder.tag_blocks.2.norm1.bias of shape (768,) 2022-03-17 13:37:58,171.171 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm1.weight loaded from bert.encoder.tag_blocks.2.norm1.weight of shape (768,) 2022-03-17 13:37:58,171.171 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm2.bias loaded from bert.encoder.tag_blocks.2.norm2.bias of shape (768,) 2022-03-17 13:37:58,171.171 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm2.weight loaded from bert.encoder.tag_blocks.2.norm2.weight of shape (768,) 2022-03-17 13:37:58,171.171 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.proj.bias loaded from bert.encoder.tag_blocks.3.attn.proj.bias of shape (768,) 2022-03-17 13:37:58,171.171 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.proj.weight loaded from bert.encoder.tag_blocks.3.attn.proj.weight of shape (768, 768) 2022-03-17 13:37:58,171.171 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.qkv.bias loaded from bert.encoder.tag_blocks.3.attn.qkv.bias of shape (2304,) 2022-03-17 13:37:58,171.171 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.qkv.weight loaded from bert.encoder.tag_blocks.3.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:37:58,171.171 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc1.bias loaded from bert.encoder.tag_blocks.3.mlp.fc1.bias of shape (3072,) 2022-03-17 13:37:58,171.171 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc1.weight loaded from bert.encoder.tag_blocks.3.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:37:58,171.171 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc2.bias loaded from bert.encoder.tag_blocks.3.mlp.fc2.bias of shape (768,) 2022-03-17 13:37:58,171.171 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc2.weight loaded from bert.encoder.tag_blocks.3.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:37:58,172.172 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm1.bias loaded from bert.encoder.tag_blocks.3.norm1.bias of shape (768,) 2022-03-17 13:37:58,172.172 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.encoder.tag_blocks.3.norm1.weight loaded from bert.encoder.tag_blocks.3.norm1.weight of shape (768,) 2022-03-17 13:37:58,172.172 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm2.bias loaded from bert.encoder.tag_blocks.3.norm2.bias of shape (768,) 2022-03-17 13:37:58,172.172 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm2.weight loaded from bert.encoder.tag_blocks.3.norm2.weight of shape (768,) 2022-03-17 13:37:58,172.172 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.LayerNorm.bias loaded from bert.extra_embeddings.LayerNorm.bias of shape (768,) 2022-03-17 13:37:58,172.172 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.LayerNorm.weight loaded from bert.extra_embeddings.LayerNorm.weight of shape (768,) 2022-03-17 13:37:58,172.172 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.position_embeddings.weight loaded from bert.extra_embeddings.position_embeddings.weight of shape (512, 768) 2022-03-17 13:37:58,172.172 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.token_type_embeddings.weight loaded from bert.extra_embeddings.token_type_embeddings.weight of shape (2, 768) 2022-03-17 13:37:58,172.172 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.word_embeddings.weight loaded from bert.extra_embeddings.word_embeddings.weight of shape (30522, 768) 2022-03-17 13:37:58,172.172 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.pooler.dense.bias loaded from bert.pooler.dense.bias of shape (768,) 2022-03-17 13:37:58,172.172 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.pooler.dense.weight loaded from bert.pooler.dense.weight of shape (768, 768) 2022-03-17 13:37:58,172.172 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.bias loaded from bert.tag_logit.predictions.bias of shape (30522,) 2022-03-17 13:37:58,172.172 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.decoder.weight loaded from bert.tag_logit.predictions.decoder.weight of shape (30522, 768) 2022-03-17 13:37:58,173.173 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.LayerNorm.bias loaded from bert.tag_logit.predictions.transform.LayerNorm.bias of shape (768,) 2022-03-17 13:37:58,173.173 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.LayerNorm.weight loaded from bert.tag_logit.predictions.transform.LayerNorm.weight of shape (768,) 2022-03-17 13:37:58,173.173 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.dense.bias loaded from bert.tag_logit.predictions.transform.dense.bias of shape (768,) 2022-03-17 13:37:58,173.173 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.dense.weight loaded from bert.tag_logit.predictions.transform.dense.weight of shape (768, 768) 2022-03-17 13:37:58,173.173 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.bias loaded from cls.predictions.bias of shape (30522,) 2022-03-17 13:37:58,173.173 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.decoder.weight loaded from cls.predictions.decoder.weight of shape (30522, 768) 2022-03-17 13:37:58,173.173 2829:checkpoint.py:99 
align_and_update_state_dicts(): module.cls.predictions.transform.LayerNorm.bias loaded from cls.predictions.transform.LayerNorm.bias of shape (768,)
2022-03-17 13:37:58,173.173 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.LayerNorm.weight loaded from cls.predictions.transform.LayerNorm.weight of shape (768,)
2022-03-17 13:37:58,173.173 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.dense.bias loaded from cls.predictions.transform.dense.bias of shape (768,)
2022-03-17 13:37:58,173.173 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.dense.weight loaded from cls.predictions.transform.dense.weight of shape (768, 768)
2022-03-17 13:37:58,173.173 2829:checkpoint.py:104 align_and_update_state_dicts(): target model param = 288; name matched = 288; loaded = 288
2022-03-17 13:37:58,173.173 2829:checkpoint.py:107 align_and_update_state_dicts(): from loaded; ignore = []
2022-03-17 13:37:58,177.177 2829:torch_common.py:1069 load_model_state_ignore_mismatch(): unique keys in init dict = ['module.cls.predictions.decoder.weight']; total = 1
2022-03-17 13:37:58,340.340 2829:torch_common.py:1074 load_model_state_ignore_mismatch(): unique key (not initialized) in current model = ['module.cls.predictions.decoder.weight']
2022-03-17 13:37:58,669.669 2829:qd_common.py:3452 print_frame_info(): func name = __init__; self = ; data = TaxCocoCaption; split = test; add_key = False; backend = cv; hold_buffer = 0; save_original = False
2022-03-17 13:37:58,695.695 2829:uni_pipeline.py:509 get_data_loader(): sampler = 
2022-03-17 13:37:58,696.696 2829:uni_pipeline.py:1000 predict(): writing output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0010000.pt.TaxCocoCaption.test.crop384.predict.tsv_0_32.tsv
2022-03-17 13:37:58,768.768 2829:uni_pipeline.py:907 predict_iter(): DatasetPlusTransform(dataset=, transform=Compose(
    Compose(
        ImageTransform2Dict(image_transform=Compose(
            ToPILImage()
            Resize(size=384, interpolation=PIL.Image.BICUBIC)
            CenterCrop(size=(384, 384))
            ToTensor()
            Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
        ))
    )
    LoadLabel(data=TaxCocoCaption, split=test, version=vinvl)
    TransCaptionTensorizer(tensorizer=, pad_to_max=True, pad_image_to_max=True)
))
uni_pipeline.py:908: 0%| | 0/10 [00:00<?, ?it/s]
--- Logging error ---
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 1025, in emit
    msg = self.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 869, in format
    return fmt.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 608, in format
    record.message = record.getMessage()
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 367, in getMessage
    msg = str(self.msg)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 236, in __repr__
    return str(self.to_json_string())
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 245, in to_json_string
    return json.dumps(self.to_dict(), indent=2, sort_keys=True) + "\n"
  File "/opt/conda/lib/python3.7/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 201, in encode
    chunks = list(chunks)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 431, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "/opt/conda/lib/python3.7/json/encoder.py", line 438, in _iterencode
    o = _default(o)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type BertTokenizer is not JSON serializable
Call stack:
  File "src/qd/pipeline.py", line 1368, in <module>
    locals()[function_name](**kwargs)
  File "src/qd/pipeline.py", line 637, in pipeline_train_eval_multi
    pip.monitor_train()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1224, in monitor_train
    need_wait_models = self.pred_eval_intermediate_models()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1286, in pred_eval_intermediate_models
    pred = self.ensure_predict(model_file=model_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 886, in ensure_predict
    self.predict(model_file, predict_result_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 989, in predict
    model = self.get_model(is_train=False)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model
    model = self.get_raw_model(is_train)
  File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model
    model = TaggerEncDecSplitForImageCaptioning(config=config)  # init from scratch
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__
    self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__
    self.encoder = TIMMVitSplitEncoder(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__
    logging.info(config)
Unable to print the message and arguments - possible formatting error.
Use the traceback above to help find the error.
2022-03-17 13:40:08,351.351 2829:modeling_bert.py:529 __init__(): TIMM Split image encoder load from pre-trained: True
2022-03-17 13:40:09,415.415 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-17 13:40:10,941.941 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-17 13:40:11,597.597 2829:modeling_bert.py:2677 __init__(): BertImgModel Image Dimension: 2054
2022-03-17 13:40:11,745.745 2829:modeling_bert.py:2688 __init__(): BertImgModel Image Dimension: 2054
2022-03-17 13:40:12,537.537 2829:tagger_caption_uni_pipeline_expanding.py:1130 get_image_encoder_model(): VIT image encoder loaded from pre-trained weight! Note that this might be replaced by pre-trained checkpoint later!
2022-03-17 13:40:12,537.537 2829:tagger_caption_uni_pipeline_expanding.py:1134 get_image_encoder_model(): Non-Patch Selection Mode.
2022-03-17 13:40:13,674.674 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-17 13:40:14,218.218 2829:checkpoint.py:240 load(): Loading checkpoint from output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0015000.pt
2022-03-17 13:40:22,393.393 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.cls_token loaded from image_encoder.module.cls_token of shape (1, 1, 768)
2022-03-17 13:40:22,393.393 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.head.bias loaded from image_encoder.module.head.bias of shape (1000,)
2022-03-17 13:40:22,393.393 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.head.weight loaded from image_encoder.module.head.weight of shape (1000, 768)
2022-03-17 13:40:22,393.393 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.patch_embed.proj.bias loaded from image_encoder.module.patch_embed.proj.bias of shape (768,)
2022-03-17 13:40:22,393.393 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.patch_embed.proj.weight loaded from image_encoder.module.patch_embed.proj.weight of shape (768, 3, 16, 16)
2022-03-17 13:40:22,394.394 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.pos_embed loaded from image_encoder.module.pos_embed of shape (1, 577, 768)
2022-03-17 13:40:22,394.394 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.caption_pooler.dense.bias loaded from bert.caption_pooler.dense.bias of shape (768,)
2022-03-17 13:40:22,394.394 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.caption_pooler.dense.weight loaded from bert.caption_pooler.dense.weight of shape (768, 768)
2022-03-17 13:40:22,394.394 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.LayerNorm.bias loaded from bert.decoder.layer.0.attention.output.LayerNorm.bias of shape (768,)
2022-03-17 13:40:22,394.394 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.LayerNorm.weight loaded from
bert.decoder.layer.0.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:40:22,394.394 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.dense.bias loaded from bert.decoder.layer.0.attention.output.dense.bias of shape (768,) 2022-03-17 13:40:22,394.394 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.dense.weight loaded from bert.decoder.layer.0.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:40:22,394.394 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.key.bias loaded from bert.decoder.layer.0.attention.self.key.bias of shape (768,) 2022-03-17 13:40:22,394.394 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.key.weight loaded from bert.decoder.layer.0.attention.self.key.weight of shape (768, 768) 2022-03-17 13:40:22,394.394 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.query.bias loaded from bert.decoder.layer.0.attention.self.query.bias of shape (768,) 2022-03-17 13:40:22,394.394 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.query.weight loaded from bert.decoder.layer.0.attention.self.query.weight of shape (768, 768) 2022-03-17 13:40:22,395.395 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.value.bias loaded from bert.decoder.layer.0.attention.self.value.bias of shape (768,) 2022-03-17 13:40:22,395.395 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.value.weight loaded from bert.decoder.layer.0.attention.self.value.weight of shape (768, 768) 2022-03-17 13:40:22,395.395 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.intermediate.dense.bias loaded from bert.decoder.layer.0.intermediate.dense.bias of shape (3072,) 2022-03-17 13:40:22,395.395 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.intermediate.dense.weight loaded from bert.decoder.layer.0.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:40:22,395.395 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.LayerNorm.bias loaded from bert.decoder.layer.0.output.LayerNorm.bias of shape (768,) 2022-03-17 13:40:22,395.395 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.LayerNorm.weight loaded from bert.decoder.layer.0.output.LayerNorm.weight of shape (768,) 2022-03-17 13:40:22,395.395 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.dense.bias loaded from bert.decoder.layer.0.output.dense.bias of shape (768,) 2022-03-17 13:40:22,395.395 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.dense.weight loaded from bert.decoder.layer.0.output.dense.weight of shape (768, 3072) 2022-03-17 13:40:22,395.395 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.LayerNorm.bias loaded from bert.decoder.layer.1.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:40:22,395.395 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.LayerNorm.weight loaded from bert.decoder.layer.1.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:40:22,395.395 2829:checkpoint.py:99 
align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.dense.bias loaded from bert.decoder.layer.1.attention.output.dense.bias of shape (768,) 2022-03-17 13:40:22,395.395 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.dense.weight loaded from bert.decoder.layer.1.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:40:22,395.395 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.key.bias loaded from bert.decoder.layer.1.attention.self.key.bias of shape (768,) 2022-03-17 13:40:22,396.396 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.key.weight loaded from bert.decoder.layer.1.attention.self.key.weight of shape (768, 768) 2022-03-17 13:40:22,396.396 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.query.bias loaded from bert.decoder.layer.1.attention.self.query.bias of shape (768,) 2022-03-17 13:40:22,396.396 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.query.weight loaded from bert.decoder.layer.1.attention.self.query.weight of shape (768, 768) 2022-03-17 13:40:22,396.396 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.value.bias loaded from bert.decoder.layer.1.attention.self.value.bias of shape (768,) 2022-03-17 13:40:22,396.396 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.value.weight loaded from bert.decoder.layer.1.attention.self.value.weight of shape (768, 768) 2022-03-17 13:40:22,396.396 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.intermediate.dense.bias loaded from bert.decoder.layer.1.intermediate.dense.bias of shape (3072,) 2022-03-17 13:40:22,396.396 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.intermediate.dense.weight loaded from bert.decoder.layer.1.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:40:22,396.396 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.LayerNorm.bias loaded from bert.decoder.layer.1.output.LayerNorm.bias of shape (768,) 2022-03-17 13:40:22,396.396 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.LayerNorm.weight loaded from bert.decoder.layer.1.output.LayerNorm.weight of shape (768,) 2022-03-17 13:40:22,396.396 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.dense.bias loaded from bert.decoder.layer.1.output.dense.bias of shape (768,) 2022-03-17 13:40:22,396.396 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.dense.weight loaded from bert.decoder.layer.1.output.dense.weight of shape (768, 3072) 2022-03-17 13:40:22,396.396 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.LayerNorm.bias loaded from bert.decoder.layer.2.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:40:22,396.396 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.LayerNorm.weight loaded from bert.decoder.layer.2.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:40:22,397.397 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.dense.bias loaded from bert.decoder.layer.2.attention.output.dense.bias of 
shape (768,) 2022-03-17 13:40:22,397.397 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.dense.weight loaded from bert.decoder.layer.2.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:40:22,397.397 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.key.bias loaded from bert.decoder.layer.2.attention.self.key.bias of shape (768,) 2022-03-17 13:40:22,397.397 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.key.weight loaded from bert.decoder.layer.2.attention.self.key.weight of shape (768, 768) 2022-03-17 13:40:22,397.397 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.query.bias loaded from bert.decoder.layer.2.attention.self.query.bias of shape (768,) 2022-03-17 13:40:22,397.397 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.query.weight loaded from bert.decoder.layer.2.attention.self.query.weight of shape (768, 768) 2022-03-17 13:40:22,397.397 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.value.bias loaded from bert.decoder.layer.2.attention.self.value.bias of shape (768,) 2022-03-17 13:40:22,397.397 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.value.weight loaded from bert.decoder.layer.2.attention.self.value.weight of shape (768, 768) 2022-03-17 13:40:22,397.397 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.intermediate.dense.bias loaded from bert.decoder.layer.2.intermediate.dense.bias of shape (3072,) 2022-03-17 13:40:22,397.397 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.intermediate.dense.weight loaded from bert.decoder.layer.2.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:40:22,397.397 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.LayerNorm.bias loaded from bert.decoder.layer.2.output.LayerNorm.bias of shape (768,) 2022-03-17 13:40:22,397.397 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.LayerNorm.weight loaded from bert.decoder.layer.2.output.LayerNorm.weight of shape (768,) 2022-03-17 13:40:22,397.397 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.dense.bias loaded from bert.decoder.layer.2.output.dense.bias of shape (768,) 2022-03-17 13:40:22,397.397 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.dense.weight loaded from bert.decoder.layer.2.output.dense.weight of shape (768, 3072) 2022-03-17 13:40:22,398.398 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.LayerNorm.bias loaded from bert.decoder.layer.3.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:40:22,398.398 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.LayerNorm.weight loaded from bert.decoder.layer.3.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:40:22,398.398 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.dense.bias loaded from bert.decoder.layer.3.attention.output.dense.bias of shape (768,) 2022-03-17 13:40:22,398.398 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.decoder.layer.3.attention.output.dense.weight loaded from bert.decoder.layer.3.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:40:22,398.398 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.key.bias loaded from bert.decoder.layer.3.attention.self.key.bias of shape (768,) 2022-03-17 13:40:22,398.398 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.key.weight loaded from bert.decoder.layer.3.attention.self.key.weight of shape (768, 768) 2022-03-17 13:40:22,398.398 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.query.bias loaded from bert.decoder.layer.3.attention.self.query.bias of shape (768,) 2022-03-17 13:40:22,398.398 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.query.weight loaded from bert.decoder.layer.3.attention.self.query.weight of shape (768, 768) 2022-03-17 13:40:22,398.398 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.value.bias loaded from bert.decoder.layer.3.attention.self.value.bias of shape (768,) 2022-03-17 13:40:22,398.398 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.value.weight loaded from bert.decoder.layer.3.attention.self.value.weight of shape (768, 768) 2022-03-17 13:40:22,398.398 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.intermediate.dense.bias loaded from bert.decoder.layer.3.intermediate.dense.bias of shape (3072,) 2022-03-17 13:40:22,398.398 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.intermediate.dense.weight loaded from bert.decoder.layer.3.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:40:22,398.398 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.LayerNorm.bias loaded from bert.decoder.layer.3.output.LayerNorm.bias of shape (768,) 2022-03-17 13:40:22,398.398 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.LayerNorm.weight loaded from bert.decoder.layer.3.output.LayerNorm.weight of shape (768,) 2022-03-17 13:40:22,399.399 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.dense.bias loaded from bert.decoder.layer.3.output.dense.bias of shape (768,) 2022-03-17 13:40:22,399.399 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.dense.weight loaded from bert.decoder.layer.3.output.dense.weight of shape (768, 3072) 2022-03-17 13:40:22,399.399 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.LayerNorm.bias loaded from bert.embeddings.LayerNorm.bias of shape (768,) 2022-03-17 13:40:22,399.399 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.LayerNorm.weight loaded from bert.embeddings.LayerNorm.weight of shape (768,) 2022-03-17 13:40:22,399.399 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.position_embeddings.weight loaded from bert.embeddings.position_embeddings.weight of shape (512, 768) 2022-03-17 13:40:22,399.399 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.token_type_embeddings.weight loaded from bert.embeddings.token_type_embeddings.weight of shape (2, 768) 2022-03-17 13:40:22,399.399 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.embeddings.word_embeddings.weight loaded from bert.embeddings.word_embeddings.weight of shape (30522, 768)
[... one checkpoint.py:99 "<target key> loaded from <checkpoint key> of shape (...)" record per parameter follows, covering bert.encoder.blocks.0-11, bert.encoder.tag_blocks.0-3, bert.extra_embeddings, bert.pooler, bert.tag_logit, and cls.predictions ...]
2022-03-17 13:40:22,415.415 2829:checkpoint.py:104 align_and_update_state_dicts(): target model param = 288; name matched = 288; loaded = 288
2022-03-17 13:40:22,415.415 2829:checkpoint.py:107 align_and_update_state_dicts(): from loaded; ignore = []
2022-03-17 13:40:22,418.418 2829:torch_common.py:1069 load_model_state_ignore_mismatch(): unique keys in init dict = ['module.cls.predictions.decoder.weight']; total = 1
2022-03-17 13:40:22,582.582 2829:torch_common.py:1074 load_model_state_ignore_mismatch(): unique key (not initialized) in current model = ['module.cls.predictions.decoder.weight']
2022-03-17 13:40:22,906.906 2829:qd_common.py:3452 print_frame_info(): func name = __init__; self = ; data = TaxCocoCaption; split = test; add_key = False; backend = cv; hold_buffer = 0; save_original = False
2022-03-17 13:40:22,931.931 2829:uni_pipeline.py:509 get_data_loader(): sampler =
2022-03-17 13:40:22,931.931 2829:uni_pipeline.py:1000 predict(): writing output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0015000.pt.TaxCocoCaption.test.crop384.predict.tsv_0_32.tsv
2022-03-17 13:40:23,003.003 2829:uni_pipeline.py:907 predict_iter(): DatasetPlusTransform(dataset=, transform=Compose(
    Compose(
        ImageTransform2Dict(image_transform=Compose(
            ToPILImage()
            Resize(size=384, interpolation=PIL.Image.BICUBIC)
            CenterCrop(size=(384, 384))
            ToTensor()
            Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
        ))
    )
    LoadLabel(data=TaxCocoCaption, split=test, version=vinvl)
    TransCaptionTensorizer(tensorizer=, pad_to_max=True, pad_image_to_max=True)
))
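The checkpoint.py:99 run above is the name-alignment step of checkpoint loading: each key of the DataParallel-wrapped target model (the "module."-prefixed names) is matched against a checkpoint key of the same name and shape, and the checkpoint.py:104 summary confirms all 288 parameters matched. A minimal sketch of the idea, assuming a plain "module." prefix rule; align_and_load is illustrative, not the project's actual checkpoint.py code:

    import logging
    import torch

    def align_and_load(model: torch.nn.Module, ckpt_state: dict) -> None:
        """Hypothetical sketch of name-based state-dict alignment."""
        model_state = model.state_dict()
        matched = {}
        for key, param in model_state.items():
            # The target model is wrapped (DataParallel); the checkpoint is not.
            src = key[len('module.'):] if key.startswith('module.') else key
            if src in ckpt_state and ckpt_state[src].shape == param.shape:
                matched[key] = ckpt_state[src]
                logging.info('%s loaded from %s of shape %s',
                             key, src, tuple(param.shape))
        logging.info('target model param = %d; name matched = %d; loaded = %d',
                     len(model_state), len(matched), len(matched))
        # strict=False: parameters absent from the checkpoint keep their
        # freshly initialized values instead of raising.
        model.load_state_dict(matched, strict=False)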
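The DatasetPlusTransform record above also pins down the eval-time image preprocessing: bicubic resize to 384, a 384x384 center crop, and normalization to the [-1, 1] range. For reference, the image part can be rebuilt with a recent torchvision as below (a sketch; ImageTransform2Dict, LoadLabel, and TransCaptionTensorizer are project-specific wrappers and are not reproduced):

    from torchvision import transforms

    # Same pipeline the log prints; InterpolationMode.BICUBIC is the modern
    # spelling of the PIL.Image.BICUBIC constant shown in the repr.
    image_transform = transforms.Compose([
        transforms.ToPILImage(),
        transforms.Resize(384, interpolation=transforms.InterpolationMode.BICUBIC),
        transforms.CenterCrop((384, 384)),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),
    ])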
uni_pipeline.py:908: 0%| | 0/10 [00:00<?, ?it/s]
--- Logging error ---
[traceback truncated in the original log; same JSON-serialization TypeError as the block below]
Call stack:
  File "src/qd/pipeline.py", line 1368, in <module>
    locals()[function_name](**kwargs)
  File "src/qd/pipeline.py", line 637, in pipeline_train_eval_multi
    pip.monitor_train()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1224, in monitor_train
    need_wait_models = self.pred_eval_intermediate_models()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1286, in pred_eval_intermediate_models
    pred = self.ensure_predict(model_file=model_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 886, in ensure_predict
    self.predict(model_file, predict_result_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 989, in predict
    model = self.get_model(is_train=False)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model
    model = self.get_raw_model(is_train)
  File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model
    model = TaggerEncDecSplitForImageCaptioning(config=config) # init from scratch
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__
    self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__
    self.encoder = TIMMVitSplitEncoder(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__
    logging.info(config)
Unable to print the message and arguments - possible formatting error.
Use the traceback above to help find the error.
--- Logging error ---
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 1025, in emit
    msg = self.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 869, in format
    return fmt.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 608, in format
    record.message = record.getMessage()
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 367, in getMessage
    msg = str(self.msg)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 236, in __repr__
    return str(self.to_json_string())
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 245, in to_json_string
    return json.dumps(self.to_dict(), indent=2, sort_keys=True) + "\n"
  File "/opt/conda/lib/python3.7/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 201, in encode
    chunks = list(chunks)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 431, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "/opt/conda/lib/python3.7/json/encoder.py", line 438, in _iterencode
    o = _default(o)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type BertTokenizer is not JSON serializable
Call stack:
  File "src/qd/pipeline.py", line 1368, in <module>
    locals()[function_name](**kwargs)
  File "src/qd/pipeline.py", line 637, in pipeline_train_eval_multi
    pip.monitor_train()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1224, in monitor_train
    need_wait_models = self.pred_eval_intermediate_models()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1286, in pred_eval_intermediate_models
    pred = self.ensure_predict(model_file=model_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 886, in ensure_predict
    self.predict(model_file, predict_result_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 989, in predict
    model = self.get_model(is_train=False)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model
    model = self.get_raw_model(is_train)
  File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model
    model = TaggerEncDecSplitForImageCaptioning(config=config) # init from scratch
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__
    self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__
    self.encoder = TIMMVitSplitEncoder(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__
    logging.info(config)
Unable to print the message and arguments - possible formatting error.
Use the traceback above to help find the error.
2022-03-17 13:42:33,536.536 2829:modeling_bert.py:529 __init__(): TIMM Split image encoder load from pre-trained: True
2022-03-17 13:42:34,594.594 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-17 13:42:36,145.145 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-17 13:42:36,795.795 2829:modeling_bert.py:2677 __init__(): BertImgModel Image Dimension: 2054
2022-03-17 13:42:36,943.943 2829:modeling_bert.py:2688 __init__(): BertImgModel Image Dimension: 2054
2022-03-17 13:42:37,732.732 2829:tagger_caption_uni_pipeline_expanding.py:1130 get_image_encoder_model(): VIT image encoder loaded from pre-trained weight! Note that this might be replaced by pre-trained checkpoint later!
2022-03-17 13:42:37,732.732 2829:tagger_caption_uni_pipeline_expanding.py:1134 get_image_encoder_model(): Non-Patch Selection Mode.
2022-03-17 13:42:38,870.870 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-17 13:42:39,407.407 2829:checkpoint.py:240 load(): Loading checkpoint from output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0020000.pt
2022-03-17 13:42:47,788.788 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.cls_token loaded from image_encoder.module.cls_token of shape (1, 1, 768)
2022-03-17 13:42:47,788.788 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.head.bias loaded from image_encoder.module.head.bias of shape (1000,)
2022-03-17 13:42:47,788.788 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.head.weight loaded from image_encoder.module.head.weight of shape (1000, 768)
2022-03-17 13:42:47,788.788 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.patch_embed.proj.bias loaded from image_encoder.module.patch_embed.proj.bias of shape (768,)
2022-03-17 13:42:47,788.788 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.patch_embed.proj.weight loaded from image_encoder.module.patch_embed.proj.weight of shape (768, 3, 16, 16)
2022-03-17 13:42:47,788.788 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.pos_embed loaded from image_encoder.module.pos_embed of shape (1, 577, 768)
2022-03-17 13:42:47,788.788 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.caption_pooler.dense.bias loaded from bert.caption_pooler.dense.bias of shape (768,)
2022-03-17 13:42:47,788.788 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.caption_pooler.dense.weight loaded from bert.caption_pooler.dense.weight of shape (768, 768)
[... per-parameter "loaded from ... of shape (...)" records continue in the same pattern for module.bert.decoder.layer.0-3, module.bert.embeddings, and module.bert.encoder.blocks.0-1 ...]
2022-03-17 13:42:47,795.795 2829:checkpoint.py:99 align_and_update_state_dicts():
module.bert.encoder.blocks.10.attn.proj.bias loaded from bert.encoder.blocks.10.attn.proj.bias of shape (768,) 2022-03-17 13:42:47,795.795 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.proj.weight loaded from bert.encoder.blocks.10.attn.proj.weight of shape (768, 768) 2022-03-17 13:42:47,796.796 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.qkv.bias loaded from bert.encoder.blocks.10.attn.qkv.bias of shape (2304,) 2022-03-17 13:42:47,796.796 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.qkv.weight loaded from bert.encoder.blocks.10.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:42:47,796.796 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc1.bias loaded from bert.encoder.blocks.10.mlp.fc1.bias of shape (3072,) 2022-03-17 13:42:47,796.796 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc1.weight loaded from bert.encoder.blocks.10.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:42:47,796.796 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc2.bias loaded from bert.encoder.blocks.10.mlp.fc2.bias of shape (768,) 2022-03-17 13:42:47,796.796 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc2.weight loaded from bert.encoder.blocks.10.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:42:47,796.796 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm1.bias loaded from bert.encoder.blocks.10.norm1.bias of shape (768,) 2022-03-17 13:42:47,796.796 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm1.weight loaded from bert.encoder.blocks.10.norm1.weight of shape (768,) 2022-03-17 13:42:47,796.796 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm2.bias loaded from bert.encoder.blocks.10.norm2.bias of shape (768,) 2022-03-17 13:42:47,796.796 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm2.weight loaded from bert.encoder.blocks.10.norm2.weight of shape (768,) 2022-03-17 13:42:47,796.796 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.proj.bias loaded from bert.encoder.blocks.11.attn.proj.bias of shape (768,) 2022-03-17 13:42:47,796.796 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.proj.weight loaded from bert.encoder.blocks.11.attn.proj.weight of shape (768, 768) 2022-03-17 13:42:47,796.796 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.qkv.bias loaded from bert.encoder.blocks.11.attn.qkv.bias of shape (2304,) 2022-03-17 13:42:47,797.797 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.qkv.weight loaded from bert.encoder.blocks.11.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:42:47,797.797 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc1.bias loaded from bert.encoder.blocks.11.mlp.fc1.bias of shape (3072,) 2022-03-17 13:42:47,797.797 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc1.weight loaded from bert.encoder.blocks.11.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:42:47,797.797 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc2.bias loaded from 
bert.encoder.blocks.11.mlp.fc2.bias of shape (768,) 2022-03-17 13:42:47,797.797 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc2.weight loaded from bert.encoder.blocks.11.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:42:47,797.797 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm1.bias loaded from bert.encoder.blocks.11.norm1.bias of shape (768,) 2022-03-17 13:42:47,797.797 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm1.weight loaded from bert.encoder.blocks.11.norm1.weight of shape (768,) 2022-03-17 13:42:47,797.797 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm2.bias loaded from bert.encoder.blocks.11.norm2.bias of shape (768,) 2022-03-17 13:42:47,797.797 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm2.weight loaded from bert.encoder.blocks.11.norm2.weight of shape (768,) 2022-03-17 13:42:47,797.797 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.proj.bias loaded from bert.encoder.blocks.2.attn.proj.bias of shape (768,) 2022-03-17 13:42:47,797.797 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.proj.weight loaded from bert.encoder.blocks.2.attn.proj.weight of shape (768, 768) 2022-03-17 13:42:47,797.797 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.qkv.bias loaded from bert.encoder.blocks.2.attn.qkv.bias of shape (2304,) 2022-03-17 13:42:47,797.797 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.qkv.weight loaded from bert.encoder.blocks.2.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:42:47,797.797 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc1.bias loaded from bert.encoder.blocks.2.mlp.fc1.bias of shape (3072,) 2022-03-17 13:42:47,798.798 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc1.weight loaded from bert.encoder.blocks.2.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:42:47,798.798 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc2.bias loaded from bert.encoder.blocks.2.mlp.fc2.bias of shape (768,) 2022-03-17 13:42:47,798.798 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc2.weight loaded from bert.encoder.blocks.2.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:42:47,798.798 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm1.bias loaded from bert.encoder.blocks.2.norm1.bias of shape (768,) 2022-03-17 13:42:47,798.798 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm1.weight loaded from bert.encoder.blocks.2.norm1.weight of shape (768,) 2022-03-17 13:42:47,798.798 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm2.bias loaded from bert.encoder.blocks.2.norm2.bias of shape (768,) 2022-03-17 13:42:47,798.798 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm2.weight loaded from bert.encoder.blocks.2.norm2.weight of shape (768,) 2022-03-17 13:42:47,798.798 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.proj.bias loaded from bert.encoder.blocks.3.attn.proj.bias of shape (768,) 2022-03-17 13:42:47,798.798 2829:checkpoint.py:99 
align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.proj.weight loaded from bert.encoder.blocks.3.attn.proj.weight of shape (768, 768) 2022-03-17 13:42:47,798.798 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.qkv.bias loaded from bert.encoder.blocks.3.attn.qkv.bias of shape (2304,) 2022-03-17 13:42:47,798.798 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.qkv.weight loaded from bert.encoder.blocks.3.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:42:47,798.798 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc1.bias loaded from bert.encoder.blocks.3.mlp.fc1.bias of shape (3072,) 2022-03-17 13:42:47,798.798 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc1.weight loaded from bert.encoder.blocks.3.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:42:47,798.798 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc2.bias loaded from bert.encoder.blocks.3.mlp.fc2.bias of shape (768,) 2022-03-17 13:42:47,799.799 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc2.weight loaded from bert.encoder.blocks.3.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:42:47,799.799 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm1.bias loaded from bert.encoder.blocks.3.norm1.bias of shape (768,) 2022-03-17 13:42:47,799.799 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm1.weight loaded from bert.encoder.blocks.3.norm1.weight of shape (768,) 2022-03-17 13:42:47,799.799 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm2.bias loaded from bert.encoder.blocks.3.norm2.bias of shape (768,) 2022-03-17 13:42:47,799.799 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm2.weight loaded from bert.encoder.blocks.3.norm2.weight of shape (768,) 2022-03-17 13:42:47,799.799 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.attn.proj.bias loaded from bert.encoder.blocks.4.attn.proj.bias of shape (768,) 2022-03-17 13:42:47,799.799 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.attn.proj.weight loaded from bert.encoder.blocks.4.attn.proj.weight of shape (768, 768) 2022-03-17 13:42:47,799.799 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.attn.qkv.bias loaded from bert.encoder.blocks.4.attn.qkv.bias of shape (2304,) 2022-03-17 13:42:47,799.799 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.attn.qkv.weight loaded from bert.encoder.blocks.4.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:42:47,799.799 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.mlp.fc1.bias loaded from bert.encoder.blocks.4.mlp.fc1.bias of shape (3072,) 2022-03-17 13:42:47,799.799 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.mlp.fc1.weight loaded from bert.encoder.blocks.4.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:42:47,799.799 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.mlp.fc2.bias loaded from bert.encoder.blocks.4.mlp.fc2.bias of shape (768,) 2022-03-17 13:42:47,799.799 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.mlp.fc2.weight loaded from 
bert.encoder.blocks.4.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:42:47,799.799 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.norm1.bias loaded from bert.encoder.blocks.4.norm1.bias of shape (768,) 2022-03-17 13:42:47,800.800 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.norm1.weight loaded from bert.encoder.blocks.4.norm1.weight of shape (768,) 2022-03-17 13:42:47,800.800 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.norm2.bias loaded from bert.encoder.blocks.4.norm2.bias of shape (768,) 2022-03-17 13:42:47,800.800 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.norm2.weight loaded from bert.encoder.blocks.4.norm2.weight of shape (768,) 2022-03-17 13:42:47,800.800 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.attn.proj.bias loaded from bert.encoder.blocks.5.attn.proj.bias of shape (768,) 2022-03-17 13:42:47,800.800 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.attn.proj.weight loaded from bert.encoder.blocks.5.attn.proj.weight of shape (768, 768) 2022-03-17 13:42:47,800.800 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.attn.qkv.bias loaded from bert.encoder.blocks.5.attn.qkv.bias of shape (2304,) 2022-03-17 13:42:47,800.800 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.attn.qkv.weight loaded from bert.encoder.blocks.5.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:42:47,800.800 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.mlp.fc1.bias loaded from bert.encoder.blocks.5.mlp.fc1.bias of shape (3072,) 2022-03-17 13:42:47,800.800 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.mlp.fc1.weight loaded from bert.encoder.blocks.5.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:42:47,800.800 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.mlp.fc2.bias loaded from bert.encoder.blocks.5.mlp.fc2.bias of shape (768,) 2022-03-17 13:42:47,800.800 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.mlp.fc2.weight loaded from bert.encoder.blocks.5.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:42:47,800.800 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm1.bias loaded from bert.encoder.blocks.5.norm1.bias of shape (768,) 2022-03-17 13:42:47,800.800 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm1.weight loaded from bert.encoder.blocks.5.norm1.weight of shape (768,) 2022-03-17 13:42:47,801.801 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm2.bias loaded from bert.encoder.blocks.5.norm2.bias of shape (768,) 2022-03-17 13:42:47,801.801 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm2.weight loaded from bert.encoder.blocks.5.norm2.weight of shape (768,) 2022-03-17 13:42:47,801.801 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.attn.proj.bias loaded from bert.encoder.blocks.6.attn.proj.bias of shape (768,) 2022-03-17 13:42:47,801.801 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.attn.proj.weight loaded from bert.encoder.blocks.6.attn.proj.weight of shape (768, 768) 2022-03-17 13:42:47,801.801 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.encoder.blocks.6.attn.qkv.bias loaded from bert.encoder.blocks.6.attn.qkv.bias of shape (2304,) 2022-03-17 13:42:47,801.801 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.attn.qkv.weight loaded from bert.encoder.blocks.6.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:42:47,801.801 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc1.bias loaded from bert.encoder.blocks.6.mlp.fc1.bias of shape (3072,) 2022-03-17 13:42:47,801.801 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc1.weight loaded from bert.encoder.blocks.6.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:42:47,801.801 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc2.bias loaded from bert.encoder.blocks.6.mlp.fc2.bias of shape (768,) 2022-03-17 13:42:47,801.801 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc2.weight loaded from bert.encoder.blocks.6.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:42:47,801.801 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm1.bias loaded from bert.encoder.blocks.6.norm1.bias of shape (768,) 2022-03-17 13:42:47,801.801 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm1.weight loaded from bert.encoder.blocks.6.norm1.weight of shape (768,) 2022-03-17 13:42:47,801.801 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm2.bias loaded from bert.encoder.blocks.6.norm2.bias of shape (768,) 2022-03-17 13:42:47,801.801 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm2.weight loaded from bert.encoder.blocks.6.norm2.weight of shape (768,) 2022-03-17 13:42:47,802.802 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.proj.bias loaded from bert.encoder.blocks.7.attn.proj.bias of shape (768,) 2022-03-17 13:42:47,802.802 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.proj.weight loaded from bert.encoder.blocks.7.attn.proj.weight of shape (768, 768) 2022-03-17 13:42:47,802.802 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.qkv.bias loaded from bert.encoder.blocks.7.attn.qkv.bias of shape (2304,) 2022-03-17 13:42:47,802.802 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.qkv.weight loaded from bert.encoder.blocks.7.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:42:47,802.802 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc1.bias loaded from bert.encoder.blocks.7.mlp.fc1.bias of shape (3072,) 2022-03-17 13:42:47,802.802 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc1.weight loaded from bert.encoder.blocks.7.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:42:47,802.802 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc2.bias loaded from bert.encoder.blocks.7.mlp.fc2.bias of shape (768,) 2022-03-17 13:42:47,802.802 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc2.weight loaded from bert.encoder.blocks.7.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:42:47,802.802 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm1.bias loaded from bert.encoder.blocks.7.norm1.bias of shape (768,) 2022-03-17 
13:42:47,802.802 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm1.weight loaded from bert.encoder.blocks.7.norm1.weight of shape (768,) 2022-03-17 13:42:47,802.802 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm2.bias loaded from bert.encoder.blocks.7.norm2.bias of shape (768,) 2022-03-17 13:42:47,802.802 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm2.weight loaded from bert.encoder.blocks.7.norm2.weight of shape (768,) 2022-03-17 13:42:47,802.802 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.proj.bias loaded from bert.encoder.blocks.8.attn.proj.bias of shape (768,) 2022-03-17 13:42:47,802.802 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.proj.weight loaded from bert.encoder.blocks.8.attn.proj.weight of shape (768, 768) 2022-03-17 13:42:47,803.803 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.qkv.bias loaded from bert.encoder.blocks.8.attn.qkv.bias of shape (2304,) 2022-03-17 13:42:47,803.803 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.qkv.weight loaded from bert.encoder.blocks.8.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:42:47,803.803 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc1.bias loaded from bert.encoder.blocks.8.mlp.fc1.bias of shape (3072,) 2022-03-17 13:42:47,803.803 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc1.weight loaded from bert.encoder.blocks.8.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:42:47,803.803 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc2.bias loaded from bert.encoder.blocks.8.mlp.fc2.bias of shape (768,) 2022-03-17 13:42:47,803.803 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc2.weight loaded from bert.encoder.blocks.8.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:42:47,803.803 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm1.bias loaded from bert.encoder.blocks.8.norm1.bias of shape (768,) 2022-03-17 13:42:47,803.803 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm1.weight loaded from bert.encoder.blocks.8.norm1.weight of shape (768,) 2022-03-17 13:42:47,803.803 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm2.bias loaded from bert.encoder.blocks.8.norm2.bias of shape (768,) 2022-03-17 13:42:47,803.803 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm2.weight loaded from bert.encoder.blocks.8.norm2.weight of shape (768,) 2022-03-17 13:42:47,803.803 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.proj.bias loaded from bert.encoder.blocks.9.attn.proj.bias of shape (768,) 2022-03-17 13:42:47,803.803 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.proj.weight loaded from bert.encoder.blocks.9.attn.proj.weight of shape (768, 768) 2022-03-17 13:42:47,803.803 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.qkv.bias loaded from bert.encoder.blocks.9.attn.qkv.bias of shape (2304,) 2022-03-17 13:42:47,804.804 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.qkv.weight loaded from 
bert.encoder.blocks.9.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:42:47,804.804 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc1.bias loaded from bert.encoder.blocks.9.mlp.fc1.bias of shape (3072,) 2022-03-17 13:42:47,804.804 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc1.weight loaded from bert.encoder.blocks.9.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:42:47,804.804 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc2.bias loaded from bert.encoder.blocks.9.mlp.fc2.bias of shape (768,) 2022-03-17 13:42:47,804.804 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc2.weight loaded from bert.encoder.blocks.9.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:42:47,804.804 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm1.bias loaded from bert.encoder.blocks.9.norm1.bias of shape (768,) 2022-03-17 13:42:47,804.804 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm1.weight loaded from bert.encoder.blocks.9.norm1.weight of shape (768,) 2022-03-17 13:42:47,804.804 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm2.bias loaded from bert.encoder.blocks.9.norm2.bias of shape (768,) 2022-03-17 13:42:47,804.804 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm2.weight loaded from bert.encoder.blocks.9.norm2.weight of shape (768,) 2022-03-17 13:42:47,804.804 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.proj.bias loaded from bert.encoder.tag_blocks.0.attn.proj.bias of shape (768,) 2022-03-17 13:42:47,805.805 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.proj.weight loaded from bert.encoder.tag_blocks.0.attn.proj.weight of shape (768, 768) 2022-03-17 13:42:47,805.805 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.qkv.bias loaded from bert.encoder.tag_blocks.0.attn.qkv.bias of shape (2304,) 2022-03-17 13:42:47,805.805 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.qkv.weight loaded from bert.encoder.tag_blocks.0.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:42:47,805.805 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc1.bias loaded from bert.encoder.tag_blocks.0.mlp.fc1.bias of shape (3072,) 2022-03-17 13:42:47,805.805 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc1.weight loaded from bert.encoder.tag_blocks.0.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:42:47,805.805 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc2.bias loaded from bert.encoder.tag_blocks.0.mlp.fc2.bias of shape (768,) 2022-03-17 13:42:47,805.805 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc2.weight loaded from bert.encoder.tag_blocks.0.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:42:47,805.805 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm1.bias loaded from bert.encoder.tag_blocks.0.norm1.bias of shape (768,) 2022-03-17 13:42:47,805.805 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm1.weight loaded from bert.encoder.tag_blocks.0.norm1.weight of shape 
(768,) 2022-03-17 13:42:47,805.805 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm2.bias loaded from bert.encoder.tag_blocks.0.norm2.bias of shape (768,) 2022-03-17 13:42:47,805.805 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm2.weight loaded from bert.encoder.tag_blocks.0.norm2.weight of shape (768,) 2022-03-17 13:42:47,805.805 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.proj.bias loaded from bert.encoder.tag_blocks.1.attn.proj.bias of shape (768,) 2022-03-17 13:42:47,805.805 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.proj.weight loaded from bert.encoder.tag_blocks.1.attn.proj.weight of shape (768, 768) 2022-03-17 13:42:47,806.806 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.qkv.bias loaded from bert.encoder.tag_blocks.1.attn.qkv.bias of shape (2304,) 2022-03-17 13:42:47,806.806 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.qkv.weight loaded from bert.encoder.tag_blocks.1.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:42:47,806.806 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc1.bias loaded from bert.encoder.tag_blocks.1.mlp.fc1.bias of shape (3072,) 2022-03-17 13:42:47,806.806 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc1.weight loaded from bert.encoder.tag_blocks.1.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:42:47,806.806 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc2.bias loaded from bert.encoder.tag_blocks.1.mlp.fc2.bias of shape (768,) 2022-03-17 13:42:47,806.806 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc2.weight loaded from bert.encoder.tag_blocks.1.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:42:47,806.806 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm1.bias loaded from bert.encoder.tag_blocks.1.norm1.bias of shape (768,) 2022-03-17 13:42:47,806.806 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm1.weight loaded from bert.encoder.tag_blocks.1.norm1.weight of shape (768,) 2022-03-17 13:42:47,806.806 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm2.bias loaded from bert.encoder.tag_blocks.1.norm2.bias of shape (768,) 2022-03-17 13:42:47,806.806 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm2.weight loaded from bert.encoder.tag_blocks.1.norm2.weight of shape (768,) 2022-03-17 13:42:47,806.806 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.proj.bias loaded from bert.encoder.tag_blocks.2.attn.proj.bias of shape (768,) 2022-03-17 13:42:47,806.806 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.proj.weight loaded from bert.encoder.tag_blocks.2.attn.proj.weight of shape (768, 768) 2022-03-17 13:42:47,806.806 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.qkv.bias loaded from bert.encoder.tag_blocks.2.attn.qkv.bias of shape (2304,) 2022-03-17 13:42:47,806.806 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.qkv.weight loaded from 
bert.encoder.tag_blocks.2.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:42:47,807.807 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc1.bias loaded from bert.encoder.tag_blocks.2.mlp.fc1.bias of shape (3072,) 2022-03-17 13:42:47,807.807 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc1.weight loaded from bert.encoder.tag_blocks.2.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:42:47,807.807 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc2.bias loaded from bert.encoder.tag_blocks.2.mlp.fc2.bias of shape (768,) 2022-03-17 13:42:47,807.807 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc2.weight loaded from bert.encoder.tag_blocks.2.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:42:47,807.807 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm1.bias loaded from bert.encoder.tag_blocks.2.norm1.bias of shape (768,) 2022-03-17 13:42:47,807.807 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm1.weight loaded from bert.encoder.tag_blocks.2.norm1.weight of shape (768,) 2022-03-17 13:42:47,807.807 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm2.bias loaded from bert.encoder.tag_blocks.2.norm2.bias of shape (768,) 2022-03-17 13:42:47,807.807 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm2.weight loaded from bert.encoder.tag_blocks.2.norm2.weight of shape (768,) 2022-03-17 13:42:47,807.807 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.proj.bias loaded from bert.encoder.tag_blocks.3.attn.proj.bias of shape (768,) 2022-03-17 13:42:47,807.807 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.proj.weight loaded from bert.encoder.tag_blocks.3.attn.proj.weight of shape (768, 768) 2022-03-17 13:42:47,807.807 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.qkv.bias loaded from bert.encoder.tag_blocks.3.attn.qkv.bias of shape (2304,) 2022-03-17 13:42:47,807.807 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.qkv.weight loaded from bert.encoder.tag_blocks.3.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:42:47,807.807 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc1.bias loaded from bert.encoder.tag_blocks.3.mlp.fc1.bias of shape (3072,) 2022-03-17 13:42:47,807.807 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc1.weight loaded from bert.encoder.tag_blocks.3.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:42:47,807.807 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc2.bias loaded from bert.encoder.tag_blocks.3.mlp.fc2.bias of shape (768,) 2022-03-17 13:42:47,808.808 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc2.weight loaded from bert.encoder.tag_blocks.3.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:42:47,808.808 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm1.bias loaded from bert.encoder.tag_blocks.3.norm1.bias of shape (768,) 2022-03-17 13:42:47,808.808 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.encoder.tag_blocks.3.norm1.weight loaded from bert.encoder.tag_blocks.3.norm1.weight of shape (768,) 2022-03-17 13:42:47,808.808 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm2.bias loaded from bert.encoder.tag_blocks.3.norm2.bias of shape (768,) 2022-03-17 13:42:47,808.808 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm2.weight loaded from bert.encoder.tag_blocks.3.norm2.weight of shape (768,) 2022-03-17 13:42:47,808.808 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.LayerNorm.bias loaded from bert.extra_embeddings.LayerNorm.bias of shape (768,) 2022-03-17 13:42:47,808.808 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.LayerNorm.weight loaded from bert.extra_embeddings.LayerNorm.weight of shape (768,) 2022-03-17 13:42:47,808.808 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.position_embeddings.weight loaded from bert.extra_embeddings.position_embeddings.weight of shape (512, 768) 2022-03-17 13:42:47,808.808 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.token_type_embeddings.weight loaded from bert.extra_embeddings.token_type_embeddings.weight of shape (2, 768) 2022-03-17 13:42:47,808.808 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.word_embeddings.weight loaded from bert.extra_embeddings.word_embeddings.weight of shape (30522, 768) 2022-03-17 13:42:47,808.808 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.pooler.dense.bias loaded from bert.pooler.dense.bias of shape (768,) 2022-03-17 13:42:47,808.808 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.pooler.dense.weight loaded from bert.pooler.dense.weight of shape (768, 768) 2022-03-17 13:42:47,808.808 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.bias loaded from bert.tag_logit.predictions.bias of shape (30522,) 2022-03-17 13:42:47,808.808 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.decoder.weight loaded from bert.tag_logit.predictions.decoder.weight of shape (30522, 768) 2022-03-17 13:42:47,809.809 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.LayerNorm.bias loaded from bert.tag_logit.predictions.transform.LayerNorm.bias of shape (768,) 2022-03-17 13:42:47,809.809 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.LayerNorm.weight loaded from bert.tag_logit.predictions.transform.LayerNorm.weight of shape (768,) 2022-03-17 13:42:47,809.809 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.dense.bias loaded from bert.tag_logit.predictions.transform.dense.bias of shape (768,) 2022-03-17 13:42:47,809.809 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.dense.weight loaded from bert.tag_logit.predictions.transform.dense.weight of shape (768, 768) 2022-03-17 13:42:47,809.809 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.bias loaded from cls.predictions.bias of shape (30522,) 2022-03-17 13:42:47,809.809 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.decoder.weight loaded from cls.predictions.decoder.weight of shape (30522, 768) 2022-03-17 13:42:47,809.809 2829:checkpoint.py:99 
align_and_update_state_dicts(): module.cls.predictions.transform.LayerNorm.bias loaded from cls.predictions.transform.LayerNorm.bias of shape (768,)
2022-03-17 13:42:47,809.809 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.LayerNorm.weight loaded from cls.predictions.transform.LayerNorm.weight of shape (768,)
2022-03-17 13:42:47,809.809 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.dense.bias loaded from cls.predictions.transform.dense.bias of shape (768,)
2022-03-17 13:42:47,809.809 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.dense.weight loaded from cls.predictions.transform.dense.weight of shape (768, 768)
2022-03-17 13:42:47,809.809 2829:checkpoint.py:104 align_and_update_state_dicts(): target model param = 288; name matched = 288; loaded = 288
2022-03-17 13:42:47,809.809 2829:checkpoint.py:107 align_and_update_state_dicts(): from loaded; ignore = []
2022-03-17 13:42:47,813.813 2829:torch_common.py:1069 load_model_state_ignore_mismatch(): unique keys in init dict = ['module.cls.predictions.decoder.weight']; total = 1
2022-03-17 13:42:47,978.978 2829:torch_common.py:1074 load_model_state_ignore_mismatch(): unique key (not initialized) in current model = ['module.cls.predictions.decoder.weight']
2022-03-17 13:42:48,303.303 2829:qd_common.py:3452 print_frame_info(): func name = __init__; self = ; data = TaxCocoCaption; split = test; add_key = False; backend = cv; hold_buffer = 0; save_original = False
2022-03-17 13:42:48,330.330 2829:uni_pipeline.py:509 get_data_loader(): sampler = 
2022-03-17 13:42:48,330.330 2829:uni_pipeline.py:1000 predict(): writing output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0020000.pt.TaxCocoCaption.test.crop384.predict.tsv_0_32.tsv
2022-03-17 13:42:48,402.402 2829:uni_pipeline.py:907 predict_iter(): DatasetPlusTransform(dataset=, transform=Compose(
    Compose(
        ImageTransform2Dict(image_transform=Compose(
            ToPILImage()
            Resize(size=384, interpolation=PIL.Image.BICUBIC)
            CenterCrop(size=(384, 384))
            ToTensor()
            Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
        ))
    )
    LoadLabel(data=TaxCocoCaption, split=test, version=vinvl)
    TransCaptionTensorizer(tensorizer=, pad_to_max=True, pad_image_to_max=True)
))
uni_pipeline.py:908: 0%| | 0/10 [00:00
    locals()[function_name](**kwargs)
  File "src/qd/pipeline.py", line 637, in pipeline_train_eval_multi
    pip.monitor_train()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1224, in monitor_train
    need_wait_models = self.pred_eval_intermediate_models()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1286, in pred_eval_intermediate_models
    pred = self.ensure_predict(model_file=model_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 886, in ensure_predict
    self.predict(model_file, predict_result_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 989, in predict
    model = self.get_model(is_train=False)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model
    model = self.get_raw_model(is_train)
  File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model
    model = TaggerEncDecSplitForImageCaptioning(config=config) # init from scratch
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__
    self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__
    self.encoder = TIMMVitSplitEncoder(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__
    logging.info(config)
Unable to print the message and arguments - possible formatting error. Use the traceback above to help find the error.
--- Logging error ---
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 1025, in emit
    msg = self.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 869, in format
    return fmt.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 608, in format
    record.message = record.getMessage()
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 367, in getMessage
    msg = str(self.msg)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 236, in __repr__
    return str(self.to_json_string())
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 245, in to_json_string
    return json.dumps(self.to_dict(), indent=2, sort_keys=True) + "\n"
  File "/opt/conda/lib/python3.7/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 201, in encode
    chunks = list(chunks)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 431, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "/opt/conda/lib/python3.7/json/encoder.py", line 438, in _iterencode
    o = _default(o)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type BertTokenizer is not JSON serializable
Call stack:
  File "src/qd/pipeline.py", line 1368, in <module>
    locals()[function_name](**kwargs)
  File "src/qd/pipeline.py", line 637, in pipeline_train_eval_multi
    pip.monitor_train()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1224, in monitor_train
    need_wait_models = self.pred_eval_intermediate_models()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1286, in pred_eval_intermediate_models
    pred = self.ensure_predict(model_file=model_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 886, in ensure_predict
    self.predict(model_file, predict_result_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 989, in predict
    model = self.get_model(is_train=False)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model
    model = self.get_raw_model(is_train)
  File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model
    model = TaggerEncDecSplitForImageCaptioning(config=config) # init from scratch
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__
    self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__
    self.encoder = TIMMVitSplitEncoder(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__
    logging.info(config)
Unable to print the message and arguments - possible formatting error.
Use the traceback above to help find the error.
2022-03-17 13:44:56,764.764 2829:modeling_bert.py:529 __init__(): TIMM Split image encoder load from pre-trained: True
2022-03-17 13:44:57,834.834 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-17 13:44:59,390.390 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-17 13:45:00,047.047 2829:modeling_bert.py:2677 __init__(): BertImgModel Image Dimension: 2054
2022-03-17 13:45:00,195.195 2829:modeling_bert.py:2688 __init__(): BertImgModel Image Dimension: 2054
2022-03-17 13:45:00,990.990 2829:tagger_caption_uni_pipeline_expanding.py:1130 get_image_encoder_model(): VIT image encoder loaded from pre-trained weight! Note that this might be replaced by pre-trained checkpoint later!
2022-03-17 13:45:00,990.990 2829:tagger_caption_uni_pipeline_expanding.py:1134 get_image_encoder_model(): Non-Patch Selection Mode.
2022-03-17 13:45:02,135.135 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-17 13:45:02,673.673 2829:checkpoint.py:240 load(): Loading checkpoint from output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0025000.pt
2022-03-17 13:45:11,003.003 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.cls_token loaded from image_encoder.module.cls_token of shape (1, 1, 768)
2022-03-17 13:45:11,003.003 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.head.bias loaded from image_encoder.module.head.bias of shape (1000,)
2022-03-17 13:45:11,003.003 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.head.weight loaded from image_encoder.module.head.weight of shape (1000, 768)
2022-03-17 13:45:11,003.003 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.patch_embed.proj.bias loaded from image_encoder.module.patch_embed.proj.bias of shape (768,)
2022-03-17 13:45:11,004.004 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.patch_embed.proj.weight loaded from image_encoder.module.patch_embed.proj.weight of shape (768, 3, 16, 16)
2022-03-17 13:45:11,004.004 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.pos_embed loaded from image_encoder.module.pos_embed of shape (1, 577, 768)
2022-03-17 13:45:11,004.004 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.caption_pooler.dense.bias loaded from bert.caption_pooler.dense.bias of shape (768,)
2022-03-17 13:45:11,004.004 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.caption_pooler.dense.weight loaded from bert.caption_pooler.dense.weight of shape (768, 768)
2022-03-17 13:45:11,004.004 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.LayerNorm.bias loaded from bert.decoder.layer.0.attention.output.LayerNorm.bias of shape (768,)
2022-03-17 13:45:11,004.004 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.LayerNorm.weight loaded from
bert.decoder.layer.0.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:45:11,004.004 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.dense.bias loaded from bert.decoder.layer.0.attention.output.dense.bias of shape (768,) 2022-03-17 13:45:11,004.004 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.dense.weight loaded from bert.decoder.layer.0.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:45:11,004.004 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.key.bias loaded from bert.decoder.layer.0.attention.self.key.bias of shape (768,) 2022-03-17 13:45:11,004.004 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.key.weight loaded from bert.decoder.layer.0.attention.self.key.weight of shape (768, 768) 2022-03-17 13:45:11,004.004 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.query.bias loaded from bert.decoder.layer.0.attention.self.query.bias of shape (768,) 2022-03-17 13:45:11,004.004 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.query.weight loaded from bert.decoder.layer.0.attention.self.query.weight of shape (768, 768) 2022-03-17 13:45:11,005.005 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.value.bias loaded from bert.decoder.layer.0.attention.self.value.bias of shape (768,) 2022-03-17 13:45:11,005.005 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.value.weight loaded from bert.decoder.layer.0.attention.self.value.weight of shape (768, 768) 2022-03-17 13:45:11,005.005 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.intermediate.dense.bias loaded from bert.decoder.layer.0.intermediate.dense.bias of shape (3072,) 2022-03-17 13:45:11,005.005 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.intermediate.dense.weight loaded from bert.decoder.layer.0.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:45:11,005.005 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.LayerNorm.bias loaded from bert.decoder.layer.0.output.LayerNorm.bias of shape (768,) 2022-03-17 13:45:11,005.005 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.LayerNorm.weight loaded from bert.decoder.layer.0.output.LayerNorm.weight of shape (768,) 2022-03-17 13:45:11,005.005 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.dense.bias loaded from bert.decoder.layer.0.output.dense.bias of shape (768,) 2022-03-17 13:45:11,005.005 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.dense.weight loaded from bert.decoder.layer.0.output.dense.weight of shape (768, 3072) 2022-03-17 13:45:11,005.005 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.LayerNorm.bias loaded from bert.decoder.layer.1.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:45:11,005.005 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.LayerNorm.weight loaded from bert.decoder.layer.1.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:45:11,005.005 2829:checkpoint.py:99 
align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.dense.bias loaded from bert.decoder.layer.1.attention.output.dense.bias of shape (768,)
[... similar checkpoint.py:99 records, 2022-03-17 13:45:11,005-025, one per parameter in key order: module.bert.decoder.layer.1-3 (attention.self.{query,key,value}, attention.output.{dense,LayerNorm}, intermediate.dense, output.{dense,LayerNorm}), module.bert.embeddings.* and module.bert.extra_embeddings.* (LayerNorm, position/token_type/word embeddings), module.bert.encoder.blocks.0-11 and module.bert.encoder.tag_blocks.0-3 (attn.{proj,qkv}, mlp.{fc1,fc2}, norm1, norm2), module.bert.pooler.dense, module.bert.tag_logit.predictions.*, and module.cls.predictions.*; each key is loaded from the identically named checkpoint key minus the "module." prefix, with shape (768,), (768, 768), (3072,), (3072, 768), (768, 3072), (2304,), (2304, 768), (512, 768), (2, 768), (30522,), or (30522, 768) ...]
2022-03-17 13:45:11,025.025 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.dense.weight loaded from cls.predictions.transform.dense.weight of shape (768, 768)
2022-03-17 13:45:11,025.025 2829:checkpoint.py:104 align_and_update_state_dicts(): target model param = 288; name matched = 288; loaded = 288
2022-03-17 13:45:11,025.025 2829:checkpoint.py:107 align_and_update_state_dicts(): from loaded; ignore = []
2022-03-17 13:45:11,028.028 2829:torch_common.py:1069 load_model_state_ignore_mismatch(): unique keys in init dict = ['module.cls.predictions.decoder.weight']; total = 1
2022-03-17 13:45:11,196.196 2829:torch_common.py:1074 load_model_state_ignore_mismatch(): unique key (not initialized) in current model = ['module.cls.predictions.decoder.weight']
2022-03-17 13:45:11,526.526 2829:qd_common.py:3452 print_frame_info(): func name = __init__; self = ; data = TaxCocoCaption; split = test; add_key = False; backend = cv; hold_buffer = 0; save_original = False
2022-03-17 13:45:11,552.552 2829:uni_pipeline.py:509 get_data_loader(): sampler =
2022-03-17 13:45:11,553.553 2829:uni_pipeline.py:1000 predict(): writing output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0025000.pt.TaxCocoCaption.test.crop384.predict.tsv_0_32.tsv
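The align_and_update_state_dicts() records above illustrate a simple alignment scheme: every model key carries a "module." prefix (added when the model is wrapped in DataParallel/DistributedDataParallel) while the checkpoint keys do not, so parameters are matched by stripping the prefix and comparing shapes; the 288/288 summary confirms a full match. A minimal sketch of that idea, assuming plain torch state dicts; the function below is illustrative, not the repository's actual checkpoint.py:

import logging
import torch

def align_and_update_state_dicts_sketch(model_state, checkpoint_state):
    # Match each model key to the checkpoint key that differs only by a
    # leading "module." prefix; copy the tensor when the shapes agree,
    # logging one record per parameter like the lines above.
    matched = 0
    for key in sorted(model_state):
        ckpt_key = key[len("module."):] if key.startswith("module.") else key
        if ckpt_key in checkpoint_state and checkpoint_state[ckpt_key].shape == model_state[key].shape:
            model_state[key] = checkpoint_state[ckpt_key]
            matched += 1
            logging.info("%s loaded from %s of shape %s",
                         key, ckpt_key, tuple(checkpoint_state[ckpt_key].shape))
    logging.info("target model param = %d; name matched = %d; loaded = %d",
                 len(model_state), matched, matched)
    return model_state

# Toy usage: one matching key, as in the pooler records above.
model_sd = {"module.bert.pooler.dense.bias": torch.zeros(768)}
ckpt_sd = {"bert.pooler.dense.bias": torch.ones(768)}
align_and_update_state_dicts_sketch(model_sd, ckpt_sd)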
2022-03-17 13:45:11,625.625 2829:uni_pipeline.py:907 predict_iter(): DatasetPlusTransform(dataset=, transform=Compose(
    Compose(
        ImageTransform2Dict(image_transform=Compose(
            ToPILImage()
            Resize(size=384, interpolation=PIL.Image.BICUBIC)
            CenterCrop(size=(384, 384))
            ToTensor()
            Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
        ))
    )
    LoadLabel(data=TaxCocoCaption, split=test, version=vinvl)
    TransCaptionTensorizer(tensorizer=, pad_to_max=True, pad_image_to_max=True)
))
uni_pipeline.py:908: 0%| | 0/10 [00:00<?, ?it/s]
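The predict_iter() record above spells out the evaluation-time image pipeline: decode with OpenCV (backend=cv), convert to PIL, bicubic-resize the shorter side to 384, center-crop to 384x384, and normalize each channel to [-1, 1]. The same composition in stand-alone torchvision (a sketch; the log's interpolation=PIL.Image.BICUBIC is the older torchvision spelling of InterpolationMode.BICUBIC):

from torchvision import transforms

# 384px bicubic resize, 384x384 center crop, [-1, 1] normalization,
# matching the Compose(...) printed in the predict_iter() record above.
eval_transform = transforms.Compose([
    transforms.ToPILImage(),  # the upstream loader hands over a numpy array
    transforms.Resize(384, interpolation=transforms.InterpolationMode.BICUBIC),
    transforms.CenterCrop((384, 384)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),
])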
File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 989, in predict model = self.get_model(is_train=False) File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model model = self.get_raw_model(is_train) File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model model = TaggerEncDecSplitForImageCaptioning(config=config) # init from scratch File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__ self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config) File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__ self.encoder = TIMMVitSplitEncoder(config) File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__ logging.info(config) Unable to print the message and arguments - possible formatting error. Use the traceback above to help find the error. 2022-03-17 13:47:26,121.121 2829:modeling_bert.py:529 __init__(): TIMM Split image encoder load from pre-trained: True 2022-03-17 13:47:27,192.192 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py 2022-03-17 13:47:28,713.713 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py 2022-03-17 13:47:29,361.361 2829:modeling_bert.py:2677 __init__(): BertImgModel Image Dimension: 2054 2022-03-17 13:47:29,509.509 2829:modeling_bert.py:2688 __init__(): BertImgModel Image Dimension: 2054 2022-03-17 13:47:30,301.301 2829:tagger_caption_uni_pipeline_expanding.py:1130 get_image_encoder_model(): VIT image encoder loaded from pre-trained weight! Note that this might be replaced by pre-trained checkpoint later! 2022-03-17 13:47:30,301.301 2829:tagger_caption_uni_pipeline_expanding.py:1134 get_image_encoder_model(): Non-Patch Selection Mode. 
2022-03-17 13:47:31,440.440 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-17 13:47:31,980.980 2829:checkpoint.py:240 load(): Loading checkpoint from output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0030000.pt
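The helpers.py:270 records here and at 13:47:27-28 are timm's load_pretrained() fetching the ImageNet-pretrained ViT-B/16 (384px) checkpoint from the release URL shown. In a stand-alone script, a timm 0.x build contemporary with this log downloads the same jx_vit_base_p16_384-83fb41ba.pth via the model name below (a version-dependent sketch; newer timm releases default to different AugReg weights for this name):

import timm
import torch

# Downloads and caches the ViT-B/16 384px weights reported above.
model = timm.create_model('vit_base_patch16_384', pretrained=True)
model.eval()

with torch.no_grad():
    logits = model(torch.randn(1, 3, 384, 384))
print(logits.shape)  # torch.Size([1, 1000]); cf. head.weight (1000, 768) below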
2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.query.weight loaded from bert.decoder.layer.0.attention.self.query.weight of shape (768, 768) 2022-03-17 13:47:40,534.534 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.value.bias loaded from bert.decoder.layer.0.attention.self.value.bias of shape (768,) 2022-03-17 13:47:40,534.534 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.value.weight loaded from bert.decoder.layer.0.attention.self.value.weight of shape (768, 768) 2022-03-17 13:47:40,534.534 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.intermediate.dense.bias loaded from bert.decoder.layer.0.intermediate.dense.bias of shape (3072,) 2022-03-17 13:47:40,534.534 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.intermediate.dense.weight loaded from bert.decoder.layer.0.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:47:40,534.534 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.LayerNorm.bias loaded from bert.decoder.layer.0.output.LayerNorm.bias of shape (768,) 2022-03-17 13:47:40,534.534 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.LayerNorm.weight loaded from bert.decoder.layer.0.output.LayerNorm.weight of shape (768,) 2022-03-17 13:47:40,534.534 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.dense.bias loaded from bert.decoder.layer.0.output.dense.bias of shape (768,) 2022-03-17 13:47:40,534.534 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.dense.weight loaded from bert.decoder.layer.0.output.dense.weight of shape (768, 3072) 2022-03-17 13:47:40,534.534 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.LayerNorm.bias loaded from bert.decoder.layer.1.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:47:40,534.534 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.LayerNorm.weight loaded from bert.decoder.layer.1.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:47:40,534.534 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.dense.bias loaded from bert.decoder.layer.1.attention.output.dense.bias of shape (768,) 2022-03-17 13:47:40,534.534 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.dense.weight loaded from bert.decoder.layer.1.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:47:40,535.535 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.key.bias loaded from bert.decoder.layer.1.attention.self.key.bias of shape (768,) 2022-03-17 13:47:40,535.535 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.key.weight loaded from bert.decoder.layer.1.attention.self.key.weight of shape (768, 768) 2022-03-17 13:47:40,535.535 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.query.bias loaded from bert.decoder.layer.1.attention.self.query.bias of shape (768,) 2022-03-17 13:47:40,535.535 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.query.weight loaded from 
bert.decoder.layer.1.attention.self.query.weight of shape (768, 768) 2022-03-17 13:47:40,535.535 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.value.bias loaded from bert.decoder.layer.1.attention.self.value.bias of shape (768,) 2022-03-17 13:47:40,535.535 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.value.weight loaded from bert.decoder.layer.1.attention.self.value.weight of shape (768, 768) 2022-03-17 13:47:40,535.535 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.intermediate.dense.bias loaded from bert.decoder.layer.1.intermediate.dense.bias of shape (3072,) 2022-03-17 13:47:40,535.535 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.intermediate.dense.weight loaded from bert.decoder.layer.1.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:47:40,535.535 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.LayerNorm.bias loaded from bert.decoder.layer.1.output.LayerNorm.bias of shape (768,) 2022-03-17 13:47:40,535.535 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.LayerNorm.weight loaded from bert.decoder.layer.1.output.LayerNorm.weight of shape (768,) 2022-03-17 13:47:40,535.535 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.dense.bias loaded from bert.decoder.layer.1.output.dense.bias of shape (768,) 2022-03-17 13:47:40,535.535 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.dense.weight loaded from bert.decoder.layer.1.output.dense.weight of shape (768, 3072) 2022-03-17 13:47:40,535.535 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.LayerNorm.bias loaded from bert.decoder.layer.2.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:47:40,535.535 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.LayerNorm.weight loaded from bert.decoder.layer.2.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:47:40,536.536 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.dense.bias loaded from bert.decoder.layer.2.attention.output.dense.bias of shape (768,) 2022-03-17 13:47:40,536.536 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.dense.weight loaded from bert.decoder.layer.2.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:47:40,536.536 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.key.bias loaded from bert.decoder.layer.2.attention.self.key.bias of shape (768,) 2022-03-17 13:47:40,536.536 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.key.weight loaded from bert.decoder.layer.2.attention.self.key.weight of shape (768, 768) 2022-03-17 13:47:40,536.536 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.query.bias loaded from bert.decoder.layer.2.attention.self.query.bias of shape (768,) 2022-03-17 13:47:40,536.536 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.query.weight loaded from bert.decoder.layer.2.attention.self.query.weight of shape (768, 768) 2022-03-17 13:47:40,536.536 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.decoder.layer.2.attention.self.value.bias loaded from bert.decoder.layer.2.attention.self.value.bias of shape (768,) 2022-03-17 13:47:40,536.536 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.value.weight loaded from bert.decoder.layer.2.attention.self.value.weight of shape (768, 768) 2022-03-17 13:47:40,536.536 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.intermediate.dense.bias loaded from bert.decoder.layer.2.intermediate.dense.bias of shape (3072,) 2022-03-17 13:47:40,536.536 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.intermediate.dense.weight loaded from bert.decoder.layer.2.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:47:40,536.536 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.LayerNorm.bias loaded from bert.decoder.layer.2.output.LayerNorm.bias of shape (768,) 2022-03-17 13:47:40,536.536 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.LayerNorm.weight loaded from bert.decoder.layer.2.output.LayerNorm.weight of shape (768,) 2022-03-17 13:47:40,536.536 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.dense.bias loaded from bert.decoder.layer.2.output.dense.bias of shape (768,) 2022-03-17 13:47:40,536.536 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.dense.weight loaded from bert.decoder.layer.2.output.dense.weight of shape (768, 3072) 2022-03-17 13:47:40,537.537 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.LayerNorm.bias loaded from bert.decoder.layer.3.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:47:40,537.537 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.LayerNorm.weight loaded from bert.decoder.layer.3.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:47:40,537.537 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.dense.bias loaded from bert.decoder.layer.3.attention.output.dense.bias of shape (768,) 2022-03-17 13:47:40,537.537 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.dense.weight loaded from bert.decoder.layer.3.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:47:40,537.537 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.key.bias loaded from bert.decoder.layer.3.attention.self.key.bias of shape (768,) 2022-03-17 13:47:40,537.537 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.key.weight loaded from bert.decoder.layer.3.attention.self.key.weight of shape (768, 768) 2022-03-17 13:47:40,537.537 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.query.bias loaded from bert.decoder.layer.3.attention.self.query.bias of shape (768,) 2022-03-17 13:47:40,537.537 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.query.weight loaded from bert.decoder.layer.3.attention.self.query.weight of shape (768, 768) 2022-03-17 13:47:40,537.537 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.value.bias loaded from bert.decoder.layer.3.attention.self.value.bias of shape (768,) 2022-03-17 
13:47:40,537.537 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.value.weight loaded from bert.decoder.layer.3.attention.self.value.weight of shape (768, 768) 2022-03-17 13:47:40,537.537 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.intermediate.dense.bias loaded from bert.decoder.layer.3.intermediate.dense.bias of shape (3072,) 2022-03-17 13:47:40,537.537 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.intermediate.dense.weight loaded from bert.decoder.layer.3.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:47:40,537.537 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.LayerNorm.bias loaded from bert.decoder.layer.3.output.LayerNorm.bias of shape (768,) 2022-03-17 13:47:40,537.537 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.LayerNorm.weight loaded from bert.decoder.layer.3.output.LayerNorm.weight of shape (768,) 2022-03-17 13:47:40,538.538 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.dense.bias loaded from bert.decoder.layer.3.output.dense.bias of shape (768,) 2022-03-17 13:47:40,538.538 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.dense.weight loaded from bert.decoder.layer.3.output.dense.weight of shape (768, 3072) 2022-03-17 13:47:40,538.538 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.LayerNorm.bias loaded from bert.embeddings.LayerNorm.bias of shape (768,) 2022-03-17 13:47:40,538.538 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.LayerNorm.weight loaded from bert.embeddings.LayerNorm.weight of shape (768,) 2022-03-17 13:47:40,538.538 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.position_embeddings.weight loaded from bert.embeddings.position_embeddings.weight of shape (512, 768) 2022-03-17 13:47:40,538.538 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.token_type_embeddings.weight loaded from bert.embeddings.token_type_embeddings.weight of shape (2, 768) 2022-03-17 13:47:40,538.538 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.word_embeddings.weight loaded from bert.embeddings.word_embeddings.weight of shape (30522, 768) 2022-03-17 13:47:40,538.538 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.proj.bias loaded from bert.encoder.blocks.0.attn.proj.bias of shape (768,) 2022-03-17 13:47:40,538.538 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.proj.weight loaded from bert.encoder.blocks.0.attn.proj.weight of shape (768, 768) 2022-03-17 13:47:40,538.538 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.qkv.bias loaded from bert.encoder.blocks.0.attn.qkv.bias of shape (2304,) 2022-03-17 13:47:40,538.538 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.qkv.weight loaded from bert.encoder.blocks.0.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:47:40,538.538 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc1.bias loaded from bert.encoder.blocks.0.mlp.fc1.bias of shape (3072,) 2022-03-17 13:47:40,538.538 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc1.weight loaded from 
bert.encoder.blocks.0.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:47:40,538.538 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc2.bias loaded from bert.encoder.blocks.0.mlp.fc2.bias of shape (768,) 2022-03-17 13:47:40,538.538 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc2.weight loaded from bert.encoder.blocks.0.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:47:40,539.539 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm1.bias loaded from bert.encoder.blocks.0.norm1.bias of shape (768,) 2022-03-17 13:47:40,539.539 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm1.weight loaded from bert.encoder.blocks.0.norm1.weight of shape (768,) 2022-03-17 13:47:40,539.539 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm2.bias loaded from bert.encoder.blocks.0.norm2.bias of shape (768,) 2022-03-17 13:47:40,539.539 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm2.weight loaded from bert.encoder.blocks.0.norm2.weight of shape (768,) 2022-03-17 13:47:40,539.539 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.proj.bias loaded from bert.encoder.blocks.1.attn.proj.bias of shape (768,) 2022-03-17 13:47:40,539.539 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.proj.weight loaded from bert.encoder.blocks.1.attn.proj.weight of shape (768, 768) 2022-03-17 13:47:40,539.539 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.qkv.bias loaded from bert.encoder.blocks.1.attn.qkv.bias of shape (2304,) 2022-03-17 13:47:40,539.539 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.qkv.weight loaded from bert.encoder.blocks.1.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:47:40,539.539 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc1.bias loaded from bert.encoder.blocks.1.mlp.fc1.bias of shape (3072,) 2022-03-17 13:47:40,539.539 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc1.weight loaded from bert.encoder.blocks.1.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:47:40,539.539 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc2.bias loaded from bert.encoder.blocks.1.mlp.fc2.bias of shape (768,) 2022-03-17 13:47:40,539.539 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc2.weight loaded from bert.encoder.blocks.1.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:47:40,539.539 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm1.bias loaded from bert.encoder.blocks.1.norm1.bias of shape (768,) 2022-03-17 13:47:40,539.539 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm1.weight loaded from bert.encoder.blocks.1.norm1.weight of shape (768,) 2022-03-17 13:47:40,540.540 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm2.bias loaded from bert.encoder.blocks.1.norm2.bias of shape (768,) 2022-03-17 13:47:40,540.540 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm2.weight loaded from bert.encoder.blocks.1.norm2.weight of shape (768,) 2022-03-17 13:47:40,540.540 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.encoder.blocks.10.attn.proj.bias loaded from bert.encoder.blocks.10.attn.proj.bias of shape (768,) 2022-03-17 13:47:40,540.540 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.proj.weight loaded from bert.encoder.blocks.10.attn.proj.weight of shape (768, 768) 2022-03-17 13:47:40,540.540 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.qkv.bias loaded from bert.encoder.blocks.10.attn.qkv.bias of shape (2304,) 2022-03-17 13:47:40,540.540 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.qkv.weight loaded from bert.encoder.blocks.10.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:47:40,540.540 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc1.bias loaded from bert.encoder.blocks.10.mlp.fc1.bias of shape (3072,) 2022-03-17 13:47:40,540.540 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc1.weight loaded from bert.encoder.blocks.10.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:47:40,540.540 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc2.bias loaded from bert.encoder.blocks.10.mlp.fc2.bias of shape (768,) 2022-03-17 13:47:40,540.540 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc2.weight loaded from bert.encoder.blocks.10.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:47:40,540.540 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm1.bias loaded from bert.encoder.blocks.10.norm1.bias of shape (768,) 2022-03-17 13:47:40,540.540 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm1.weight loaded from bert.encoder.blocks.10.norm1.weight of shape (768,) 2022-03-17 13:47:40,540.540 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm2.bias loaded from bert.encoder.blocks.10.norm2.bias of shape (768,) 2022-03-17 13:47:40,540.540 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm2.weight loaded from bert.encoder.blocks.10.norm2.weight of shape (768,) 2022-03-17 13:47:40,541.541 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.proj.bias loaded from bert.encoder.blocks.11.attn.proj.bias of shape (768,) 2022-03-17 13:47:40,541.541 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.proj.weight loaded from bert.encoder.blocks.11.attn.proj.weight of shape (768, 768) 2022-03-17 13:47:40,541.541 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.qkv.bias loaded from bert.encoder.blocks.11.attn.qkv.bias of shape (2304,) 2022-03-17 13:47:40,541.541 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.qkv.weight loaded from bert.encoder.blocks.11.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:47:40,541.541 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc1.bias loaded from bert.encoder.blocks.11.mlp.fc1.bias of shape (3072,) 2022-03-17 13:47:40,541.541 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc1.weight loaded from bert.encoder.blocks.11.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:47:40,541.541 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc2.bias loaded from 
bert.encoder.blocks.11.mlp.fc2.bias of shape (768,) 2022-03-17 13:47:40,541.541 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc2.weight loaded from bert.encoder.blocks.11.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:47:40,541.541 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm1.bias loaded from bert.encoder.blocks.11.norm1.bias of shape (768,) 2022-03-17 13:47:40,541.541 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm1.weight loaded from bert.encoder.blocks.11.norm1.weight of shape (768,) 2022-03-17 13:47:40,541.541 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm2.bias loaded from bert.encoder.blocks.11.norm2.bias of shape (768,) 2022-03-17 13:47:40,541.541 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm2.weight loaded from bert.encoder.blocks.11.norm2.weight of shape (768,) 2022-03-17 13:47:40,541.541 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.proj.bias loaded from bert.encoder.blocks.2.attn.proj.bias of shape (768,) 2022-03-17 13:47:40,542.542 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.proj.weight loaded from bert.encoder.blocks.2.attn.proj.weight of shape (768, 768) 2022-03-17 13:47:40,542.542 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.qkv.bias loaded from bert.encoder.blocks.2.attn.qkv.bias of shape (2304,) 2022-03-17 13:47:40,542.542 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.qkv.weight loaded from bert.encoder.blocks.2.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:47:40,542.542 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc1.bias loaded from bert.encoder.blocks.2.mlp.fc1.bias of shape (3072,) 2022-03-17 13:47:40,542.542 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc1.weight loaded from bert.encoder.blocks.2.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:47:40,542.542 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc2.bias loaded from bert.encoder.blocks.2.mlp.fc2.bias of shape (768,) 2022-03-17 13:47:40,542.542 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc2.weight loaded from bert.encoder.blocks.2.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:47:40,542.542 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm1.bias loaded from bert.encoder.blocks.2.norm1.bias of shape (768,) 2022-03-17 13:47:40,542.542 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm1.weight loaded from bert.encoder.blocks.2.norm1.weight of shape (768,) 2022-03-17 13:47:40,542.542 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm2.bias loaded from bert.encoder.blocks.2.norm2.bias of shape (768,) 2022-03-17 13:47:40,542.542 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm2.weight loaded from bert.encoder.blocks.2.norm2.weight of shape (768,) 2022-03-17 13:47:40,542.542 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.proj.bias loaded from bert.encoder.blocks.3.attn.proj.bias of shape (768,) 2022-03-17 13:47:40,542.542 2829:checkpoint.py:99 
align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.proj.weight loaded from bert.encoder.blocks.3.attn.proj.weight of shape (768, 768) 2022-03-17 13:47:40,543.543 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.qkv.bias loaded from bert.encoder.blocks.3.attn.qkv.bias of shape (2304,) 2022-03-17 13:47:40,543.543 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.qkv.weight loaded from bert.encoder.blocks.3.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:47:40,543.543 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc1.bias loaded from bert.encoder.blocks.3.mlp.fc1.bias of shape (3072,) 2022-03-17 13:47:40,543.543 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc1.weight loaded from bert.encoder.blocks.3.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:47:40,543.543 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc2.bias loaded from bert.encoder.blocks.3.mlp.fc2.bias of shape (768,) 2022-03-17 13:47:40,543.543 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc2.weight loaded from bert.encoder.blocks.3.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:47:40,543.543 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm1.bias loaded from bert.encoder.blocks.3.norm1.bias of shape (768,) 2022-03-17 13:47:40,543.543 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm1.weight loaded from bert.encoder.blocks.3.norm1.weight of shape (768,) 2022-03-17 13:47:40,543.543 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm2.bias loaded from bert.encoder.blocks.3.norm2.bias of shape (768,) 2022-03-17 13:47:40,543.543 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm2.weight loaded from bert.encoder.blocks.3.norm2.weight of shape (768,) 2022-03-17 13:47:40,543.543 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.attn.proj.bias loaded from bert.encoder.blocks.4.attn.proj.bias of shape (768,) 2022-03-17 13:47:40,543.543 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.attn.proj.weight loaded from bert.encoder.blocks.4.attn.proj.weight of shape (768, 768) 2022-03-17 13:47:40,543.543 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.attn.qkv.bias loaded from bert.encoder.blocks.4.attn.qkv.bias of shape (2304,) 2022-03-17 13:47:40,543.543 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.attn.qkv.weight loaded from bert.encoder.blocks.4.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:47:40,544.544 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.mlp.fc1.bias loaded from bert.encoder.blocks.4.mlp.fc1.bias of shape (3072,) 2022-03-17 13:47:40,544.544 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.mlp.fc1.weight loaded from bert.encoder.blocks.4.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:47:40,544.544 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.mlp.fc2.bias loaded from bert.encoder.blocks.4.mlp.fc2.bias of shape (768,) 2022-03-17 13:47:40,544.544 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.mlp.fc2.weight loaded from 
bert.encoder.blocks.4.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:47:40,544.544 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.norm1.bias loaded from bert.encoder.blocks.4.norm1.bias of shape (768,) 2022-03-17 13:47:40,544.544 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.norm1.weight loaded from bert.encoder.blocks.4.norm1.weight of shape (768,) 2022-03-17 13:47:40,544.544 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.norm2.bias loaded from bert.encoder.blocks.4.norm2.bias of shape (768,) 2022-03-17 13:47:40,544.544 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.norm2.weight loaded from bert.encoder.blocks.4.norm2.weight of shape (768,) 2022-03-17 13:47:40,544.544 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.attn.proj.bias loaded from bert.encoder.blocks.5.attn.proj.bias of shape (768,) 2022-03-17 13:47:40,544.544 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.attn.proj.weight loaded from bert.encoder.blocks.5.attn.proj.weight of shape (768, 768) 2022-03-17 13:47:40,544.544 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.attn.qkv.bias loaded from bert.encoder.blocks.5.attn.qkv.bias of shape (2304,) 2022-03-17 13:47:40,544.544 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.attn.qkv.weight loaded from bert.encoder.blocks.5.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:47:40,545.545 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.mlp.fc1.bias loaded from bert.encoder.blocks.5.mlp.fc1.bias of shape (3072,) 2022-03-17 13:47:40,545.545 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.mlp.fc1.weight loaded from bert.encoder.blocks.5.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:47:40,545.545 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.mlp.fc2.bias loaded from bert.encoder.blocks.5.mlp.fc2.bias of shape (768,) 2022-03-17 13:47:40,545.545 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.mlp.fc2.weight loaded from bert.encoder.blocks.5.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:47:40,545.545 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm1.bias loaded from bert.encoder.blocks.5.norm1.bias of shape (768,) 2022-03-17 13:47:40,545.545 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm1.weight loaded from bert.encoder.blocks.5.norm1.weight of shape (768,) 2022-03-17 13:47:40,545.545 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm2.bias loaded from bert.encoder.blocks.5.norm2.bias of shape (768,) 2022-03-17 13:47:40,545.545 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm2.weight loaded from bert.encoder.blocks.5.norm2.weight of shape (768,) 2022-03-17 13:47:40,545.545 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.attn.proj.bias loaded from bert.encoder.blocks.6.attn.proj.bias of shape (768,) 2022-03-17 13:47:40,545.545 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.attn.proj.weight loaded from bert.encoder.blocks.6.attn.proj.weight of shape (768, 768) 2022-03-17 13:47:40,545.545 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.encoder.blocks.6.attn.qkv.bias loaded from bert.encoder.blocks.6.attn.qkv.bias of shape (2304,) 2022-03-17 13:47:40,545.545 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.attn.qkv.weight loaded from bert.encoder.blocks.6.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:47:40,545.545 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc1.bias loaded from bert.encoder.blocks.6.mlp.fc1.bias of shape (3072,) 2022-03-17 13:47:40,546.546 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc1.weight loaded from bert.encoder.blocks.6.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:47:40,546.546 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc2.bias loaded from bert.encoder.blocks.6.mlp.fc2.bias of shape (768,) 2022-03-17 13:47:40,546.546 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc2.weight loaded from bert.encoder.blocks.6.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:47:40,546.546 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm1.bias loaded from bert.encoder.blocks.6.norm1.bias of shape (768,) 2022-03-17 13:47:40,546.546 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm1.weight loaded from bert.encoder.blocks.6.norm1.weight of shape (768,) 2022-03-17 13:47:40,546.546 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm2.bias loaded from bert.encoder.blocks.6.norm2.bias of shape (768,) 2022-03-17 13:47:40,546.546 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm2.weight loaded from bert.encoder.blocks.6.norm2.weight of shape (768,) 2022-03-17 13:47:40,546.546 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.proj.bias loaded from bert.encoder.blocks.7.attn.proj.bias of shape (768,) 2022-03-17 13:47:40,546.546 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.proj.weight loaded from bert.encoder.blocks.7.attn.proj.weight of shape (768, 768) 2022-03-17 13:47:40,546.546 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.qkv.bias loaded from bert.encoder.blocks.7.attn.qkv.bias of shape (2304,) 2022-03-17 13:47:40,546.546 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.qkv.weight loaded from bert.encoder.blocks.7.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:47:40,546.546 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc1.bias loaded from bert.encoder.blocks.7.mlp.fc1.bias of shape (3072,) 2022-03-17 13:47:40,546.546 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc1.weight loaded from bert.encoder.blocks.7.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:47:40,546.546 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc2.bias loaded from bert.encoder.blocks.7.mlp.fc2.bias of shape (768,) 2022-03-17 13:47:40,547.547 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc2.weight loaded from bert.encoder.blocks.7.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:47:40,547.547 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm1.bias loaded from bert.encoder.blocks.7.norm1.bias of shape (768,) 2022-03-17 
13:47:40,547.547 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm1.weight loaded from bert.encoder.blocks.7.norm1.weight of shape (768,) 2022-03-17 13:47:40,547.547 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm2.bias loaded from bert.encoder.blocks.7.norm2.bias of shape (768,) 2022-03-17 13:47:40,547.547 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm2.weight loaded from bert.encoder.blocks.7.norm2.weight of shape (768,) 2022-03-17 13:47:40,547.547 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.proj.bias loaded from bert.encoder.blocks.8.attn.proj.bias of shape (768,) 2022-03-17 13:47:40,547.547 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.proj.weight loaded from bert.encoder.blocks.8.attn.proj.weight of shape (768, 768) 2022-03-17 13:47:40,547.547 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.qkv.bias loaded from bert.encoder.blocks.8.attn.qkv.bias of shape (2304,) 2022-03-17 13:47:40,547.547 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.qkv.weight loaded from bert.encoder.blocks.8.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:47:40,547.547 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc1.bias loaded from bert.encoder.blocks.8.mlp.fc1.bias of shape (3072,) 2022-03-17 13:47:40,547.547 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc1.weight loaded from bert.encoder.blocks.8.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:47:40,547.547 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc2.bias loaded from bert.encoder.blocks.8.mlp.fc2.bias of shape (768,) 2022-03-17 13:47:40,547.547 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc2.weight loaded from bert.encoder.blocks.8.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:47:40,548.548 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm1.bias loaded from bert.encoder.blocks.8.norm1.bias of shape (768,) 2022-03-17 13:47:40,548.548 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm1.weight loaded from bert.encoder.blocks.8.norm1.weight of shape (768,) 2022-03-17 13:47:40,548.548 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm2.bias loaded from bert.encoder.blocks.8.norm2.bias of shape (768,) 2022-03-17 13:47:40,548.548 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm2.weight loaded from bert.encoder.blocks.8.norm2.weight of shape (768,) 2022-03-17 13:47:40,548.548 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.proj.bias loaded from bert.encoder.blocks.9.attn.proj.bias of shape (768,) 2022-03-17 13:47:40,548.548 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.proj.weight loaded from bert.encoder.blocks.9.attn.proj.weight of shape (768, 768) 2022-03-17 13:47:40,548.548 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.qkv.bias loaded from bert.encoder.blocks.9.attn.qkv.bias of shape (2304,) 2022-03-17 13:47:40,548.548 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.qkv.weight loaded from 
bert.encoder.blocks.9.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:47:40,548.548 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc1.bias loaded from bert.encoder.blocks.9.mlp.fc1.bias of shape (3072,) 2022-03-17 13:47:40,548.548 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc1.weight loaded from bert.encoder.blocks.9.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:47:40,548.548 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc2.bias loaded from bert.encoder.blocks.9.mlp.fc2.bias of shape (768,) 2022-03-17 13:47:40,548.548 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc2.weight loaded from bert.encoder.blocks.9.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:47:40,548.548 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm1.bias loaded from bert.encoder.blocks.9.norm1.bias of shape (768,) 2022-03-17 13:47:40,548.548 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm1.weight loaded from bert.encoder.blocks.9.norm1.weight of shape (768,) 2022-03-17 13:47:40,549.549 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm2.bias loaded from bert.encoder.blocks.9.norm2.bias of shape (768,) 2022-03-17 13:47:40,549.549 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm2.weight loaded from bert.encoder.blocks.9.norm2.weight of shape (768,) 2022-03-17 13:47:40,549.549 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.proj.bias loaded from bert.encoder.tag_blocks.0.attn.proj.bias of shape (768,) 2022-03-17 13:47:40,549.549 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.proj.weight loaded from bert.encoder.tag_blocks.0.attn.proj.weight of shape (768, 768) 2022-03-17 13:47:40,549.549 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.qkv.bias loaded from bert.encoder.tag_blocks.0.attn.qkv.bias of shape (2304,) 2022-03-17 13:47:40,549.549 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.qkv.weight loaded from bert.encoder.tag_blocks.0.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:47:40,549.549 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc1.bias loaded from bert.encoder.tag_blocks.0.mlp.fc1.bias of shape (3072,) 2022-03-17 13:47:40,549.549 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc1.weight loaded from bert.encoder.tag_blocks.0.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:47:40,549.549 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc2.bias loaded from bert.encoder.tag_blocks.0.mlp.fc2.bias of shape (768,) 2022-03-17 13:47:40,549.549 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc2.weight loaded from bert.encoder.tag_blocks.0.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:47:40,549.549 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm1.bias loaded from bert.encoder.tag_blocks.0.norm1.bias of shape (768,) 2022-03-17 13:47:40,549.549 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm1.weight loaded from bert.encoder.tag_blocks.0.norm1.weight of shape 
(768,) 2022-03-17 13:47:40,549.549 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm2.bias loaded from bert.encoder.tag_blocks.0.norm2.bias of shape (768,) 2022-03-17 13:47:40,550.550 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm2.weight loaded from bert.encoder.tag_blocks.0.norm2.weight of shape (768,) 2022-03-17 13:47:40,550.550 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.proj.bias loaded from bert.encoder.tag_blocks.1.attn.proj.bias of shape (768,) 2022-03-17 13:47:40,550.550 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.proj.weight loaded from bert.encoder.tag_blocks.1.attn.proj.weight of shape (768, 768) 2022-03-17 13:47:40,550.550 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.qkv.bias loaded from bert.encoder.tag_blocks.1.attn.qkv.bias of shape (2304,) 2022-03-17 13:47:40,550.550 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.qkv.weight loaded from bert.encoder.tag_blocks.1.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:47:40,550.550 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc1.bias loaded from bert.encoder.tag_blocks.1.mlp.fc1.bias of shape (3072,) 2022-03-17 13:47:40,550.550 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc1.weight loaded from bert.encoder.tag_blocks.1.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:47:40,550.550 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc2.bias loaded from bert.encoder.tag_blocks.1.mlp.fc2.bias of shape (768,) 2022-03-17 13:47:40,550.550 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc2.weight loaded from bert.encoder.tag_blocks.1.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:47:40,550.550 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm1.bias loaded from bert.encoder.tag_blocks.1.norm1.bias of shape (768,) 2022-03-17 13:47:40,550.550 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm1.weight loaded from bert.encoder.tag_blocks.1.norm1.weight of shape (768,) 2022-03-17 13:47:40,550.550 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm2.bias loaded from bert.encoder.tag_blocks.1.norm2.bias of shape (768,) 2022-03-17 13:47:40,550.550 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm2.weight loaded from bert.encoder.tag_blocks.1.norm2.weight of shape (768,) 2022-03-17 13:47:40,550.550 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.proj.bias loaded from bert.encoder.tag_blocks.2.attn.proj.bias of shape (768,) 2022-03-17 13:47:40,551.551 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.proj.weight loaded from bert.encoder.tag_blocks.2.attn.proj.weight of shape (768, 768) 2022-03-17 13:47:40,551.551 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.qkv.bias loaded from bert.encoder.tag_blocks.2.attn.qkv.bias of shape (2304,) 2022-03-17 13:47:40,551.551 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.qkv.weight loaded from 
bert.encoder.tag_blocks.2.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:47:40,551.551 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc1.bias loaded from bert.encoder.tag_blocks.2.mlp.fc1.bias of shape (3072,) 2022-03-17 13:47:40,551.551 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc1.weight loaded from bert.encoder.tag_blocks.2.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:47:40,551.551 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc2.bias loaded from bert.encoder.tag_blocks.2.mlp.fc2.bias of shape (768,) 2022-03-17 13:47:40,551.551 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc2.weight loaded from bert.encoder.tag_blocks.2.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:47:40,551.551 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm1.bias loaded from bert.encoder.tag_blocks.2.norm1.bias of shape (768,) 2022-03-17 13:47:40,551.551 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm1.weight loaded from bert.encoder.tag_blocks.2.norm1.weight of shape (768,) 2022-03-17 13:47:40,551.551 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm2.bias loaded from bert.encoder.tag_blocks.2.norm2.bias of shape (768,) 2022-03-17 13:47:40,551.551 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm2.weight loaded from bert.encoder.tag_blocks.2.norm2.weight of shape (768,) 2022-03-17 13:47:40,551.551 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.proj.bias loaded from bert.encoder.tag_blocks.3.attn.proj.bias of shape (768,) 2022-03-17 13:47:40,551.551 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.proj.weight loaded from bert.encoder.tag_blocks.3.attn.proj.weight of shape (768, 768) 2022-03-17 13:47:40,551.551 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.qkv.bias loaded from bert.encoder.tag_blocks.3.attn.qkv.bias of shape (2304,) 2022-03-17 13:47:40,552.552 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.qkv.weight loaded from bert.encoder.tag_blocks.3.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:47:40,552.552 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc1.bias loaded from bert.encoder.tag_blocks.3.mlp.fc1.bias of shape (3072,) 2022-03-17 13:47:40,552.552 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc1.weight loaded from bert.encoder.tag_blocks.3.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:47:40,552.552 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc2.bias loaded from bert.encoder.tag_blocks.3.mlp.fc2.bias of shape (768,) 2022-03-17 13:47:40,552.552 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc2.weight loaded from bert.encoder.tag_blocks.3.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:47:40,552.552 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm1.bias loaded from bert.encoder.tag_blocks.3.norm1.bias of shape (768,) 2022-03-17 13:47:40,552.552 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.encoder.tag_blocks.3.norm1.weight loaded from bert.encoder.tag_blocks.3.norm1.weight of shape (768,) 2022-03-17 13:47:40,552.552 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm2.bias loaded from bert.encoder.tag_blocks.3.norm2.bias of shape (768,) 2022-03-17 13:47:40,552.552 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm2.weight loaded from bert.encoder.tag_blocks.3.norm2.weight of shape (768,) 2022-03-17 13:47:40,552.552 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.LayerNorm.bias loaded from bert.extra_embeddings.LayerNorm.bias of shape (768,) 2022-03-17 13:47:40,552.552 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.LayerNorm.weight loaded from bert.extra_embeddings.LayerNorm.weight of shape (768,) 2022-03-17 13:47:40,552.552 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.position_embeddings.weight loaded from bert.extra_embeddings.position_embeddings.weight of shape (512, 768) 2022-03-17 13:47:40,552.552 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.token_type_embeddings.weight loaded from bert.extra_embeddings.token_type_embeddings.weight of shape (2, 768) 2022-03-17 13:47:40,553.553 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.word_embeddings.weight loaded from bert.extra_embeddings.word_embeddings.weight of shape (30522, 768) 2022-03-17 13:47:40,553.553 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.pooler.dense.bias loaded from bert.pooler.dense.bias of shape (768,) 2022-03-17 13:47:40,553.553 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.pooler.dense.weight loaded from bert.pooler.dense.weight of shape (768, 768) 2022-03-17 13:47:40,553.553 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.bias loaded from bert.tag_logit.predictions.bias of shape (30522,) 2022-03-17 13:47:40,553.553 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.decoder.weight loaded from bert.tag_logit.predictions.decoder.weight of shape (30522, 768) 2022-03-17 13:47:40,553.553 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.LayerNorm.bias loaded from bert.tag_logit.predictions.transform.LayerNorm.bias of shape (768,) 2022-03-17 13:47:40,553.553 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.LayerNorm.weight loaded from bert.tag_logit.predictions.transform.LayerNorm.weight of shape (768,) 2022-03-17 13:47:40,553.553 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.dense.bias loaded from bert.tag_logit.predictions.transform.dense.bias of shape (768,) 2022-03-17 13:47:40,553.553 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.dense.weight loaded from bert.tag_logit.predictions.transform.dense.weight of shape (768, 768) 2022-03-17 13:47:40,553.553 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.bias loaded from cls.predictions.bias of shape (30522,) 2022-03-17 13:47:40,553.553 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.decoder.weight loaded from cls.predictions.decoder.weight of shape (30522, 768) 2022-03-17 13:47:40,553.553 2829:checkpoint.py:99 
2022-03-17 13:47:40,554.554 2829:checkpoint.py:104 align_and_update_state_dicts(): target model param = 288; name matched = 288; loaded = 288
2022-03-17 13:47:40,554.554 2829:checkpoint.py:107 align_and_update_state_dicts(): from loaded; ignore = []
2022-03-17 13:47:40,557.557 2829:torch_common.py:1069 load_model_state_ignore_mismatch(): unique keys in init dict = ['module.cls.predictions.decoder.weight']; total = 1
2022-03-17 13:47:40,725.725 2829:torch_common.py:1074 load_model_state_ignore_mismatch(): unique key (not initialized) in current model = ['module.cls.predictions.decoder.weight']
2022-03-17 13:47:41,050.050 2829:qd_common.py:3452 print_frame_info(): func name = __init__; self = <...>; data = TaxCocoCaption; split = test; add_key = False; backend = cv; hold_buffer = 0; save_original = False
2022-03-17 13:47:41,076.076 2829:uni_pipeline.py:509 get_data_loader(): sampler = <...>
2022-03-17 13:47:41,076.076 2829:uni_pipeline.py:1000 predict(): writing output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0030000.pt.TaxCocoCaption.test.crop384.predict.tsv_0_32.tsv
2022-03-17 13:47:41,148.148 2829:uni_pipeline.py:907 predict_iter(): DatasetPlusTransform(dataset=<...>, transform=Compose(
    Compose(
        ImageTransform2Dict(image_transform=Compose(
            ToPILImage()
            Resize(size=384, interpolation=PIL.Image.BICUBIC)
            CenterCrop(size=(384, 384))
            ToTensor()
            Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
        ))
    )
    LoadLabel(data=TaxCocoCaption, split=test, version=vinvl)
    TransCaptionTensorizer(tensorizer=<...>, pad_to_max=True, pad_image_to_max=True)
))
uni_pipeline.py:908: 0%| | 0/10 [00:00<?, ?it/s]
--- Logging error ---
[... formatting-exception traceback, identical to the one in the block below ...]
Call stack:
  File "src/qd/pipeline.py", line 1368, in <module>
    locals()[function_name](**kwargs)
  File "src/qd/pipeline.py", line 637, in pipeline_train_eval_multi
    pip.monitor_train()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1224, in monitor_train
    need_wait_models = self.pred_eval_intermediate_models()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1286, in pred_eval_intermediate_models
    pred = self.ensure_predict(model_file=model_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 886, in ensure_predict
    self.predict(model_file, predict_result_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 989, in predict
    model = self.get_model(is_train=False)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model
    model = self.get_raw_model(is_train)
  File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model
    model = TaggerEncDecSplitForImageCaptioning(config=config)  # init from scratch
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__
    self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config)
"/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__ self.encoder = TIMMVitSplitEncoder(config) File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__ logging.info(config) Unable to print the message and arguments - possible formatting error. Use the traceback above to help find the error. --- Logging error --- Traceback (most recent call last): File "/opt/conda/lib/python3.7/logging/__init__.py", line 1025, in emit msg = self.format(record) File "/opt/conda/lib/python3.7/logging/__init__.py", line 869, in format return fmt.format(record) File "/opt/conda/lib/python3.7/logging/__init__.py", line 608, in format record.message = record.getMessage() File "/opt/conda/lib/python3.7/logging/__init__.py", line 367, in getMessage msg = str(self.msg) File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 236, in __repr__ return str(self.to_json_string()) File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 245, in to_json_string return json.dumps(self.to_dict(), indent=2, sort_keys=True) + "\n" File "/opt/conda/lib/python3.7/json/__init__.py", line 238, in dumps **kw).encode(obj) File "/opt/conda/lib/python3.7/json/encoder.py", line 201, in encode chunks = list(chunks) File "/opt/conda/lib/python3.7/json/encoder.py", line 431, in _iterencode yield from _iterencode_dict(o, _current_indent_level) File "/opt/conda/lib/python3.7/json/encoder.py", line 405, in _iterencode_dict yield from chunks File "/opt/conda/lib/python3.7/json/encoder.py", line 438, in _iterencode o = _default(o) File "/opt/conda/lib/python3.7/json/encoder.py", line 179, in default raise TypeError(f'Object of type {o.__class__.__name__} ' TypeError: Object of type BertTokenizer is not JSON serializable Call stack: File "src/qd/pipeline.py", line 1368, in locals()[function_name](**kwargs) File "src/qd/pipeline.py", line 637, in pipeline_train_eval_multi pip.monitor_train() File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1224, in monitor_train need_wait_models = self.pred_eval_intermediate_models() File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1286, in pred_eval_intermediate_models pred = self.ensure_predict(model_file=model_file) File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 886, in ensure_predict self.predict(model_file, predict_result_file) File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 989, in predict model = self.get_model(is_train=False) File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model model = self.get_raw_model(is_train) File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model model = TaggerEncDecSplitForImageCaptioning(config=config) # init from scratch File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__ self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config) File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__ self.encoder = TIMMVitSplitEncoder(config) File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__ logging.info(config) Unable to print the message and arguments - possible formatting error. Use the traceback above to help find the error. 
2022-03-17 13:49:53,005.005 2829:modeling_bert.py:529 __init__(): TIMM Split image encoder load from pre-trained: True
2022-03-17 13:49:54,070.070 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-17 13:49:55,610.610 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-17 13:49:56,259.259 2829:modeling_bert.py:2677 __init__(): BertImgModel Image Dimension: 2054
2022-03-17 13:49:56,407.407 2829:modeling_bert.py:2688 __init__(): BertImgModel Image Dimension: 2054
2022-03-17 13:49:57,197.197 2829:tagger_caption_uni_pipeline_expanding.py:1130 get_image_encoder_model(): VIT image encoder loaded from pre-trained weight! Note that this might be replaced by pre-trained checkpoint later!
2022-03-17 13:49:57,197.197 2829:tagger_caption_uni_pipeline_expanding.py:1134 get_image_encoder_model(): Non-Patch Selection Mode.
2022-03-17 13:49:58,335.335 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-17 13:49:58,876.876 2829:checkpoint.py:240 load(): Loading checkpoint from output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0035000.pt
2022-03-17 13:50:07,173.173 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.cls_token loaded from image_encoder.module.cls_token of shape (1, 1, 768)
2022-03-17 13:50:07,174.174 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.head.bias loaded from image_encoder.module.head.bias of shape (1000,)
2022-03-17 13:50:07,174.174 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.head.weight loaded from image_encoder.module.head.weight of shape (1000, 768)
2022-03-17 13:50:07,174.174 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.patch_embed.proj.bias loaded from image_encoder.module.patch_embed.proj.bias of shape (768,)
2022-03-17 13:50:07,174.174 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.patch_embed.proj.weight loaded from image_encoder.module.patch_embed.proj.weight of shape (768, 3, 16, 16)
2022-03-17 13:50:07,174.174 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.pos_embed loaded from image_encoder.module.pos_embed of shape (1, 577, 768)
2022-03-17 13:50:07,174.174 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.caption_pooler.dense.bias loaded from bert.caption_pooler.dense.bias of shape (768,)
2022-03-17 13:50:07,174.174 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.caption_pooler.dense.weight loaded from bert.caption_pooler.dense.weight of shape (768, 768)
2022-03-17 13:50:07,174.174 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.LayerNorm.bias loaded from bert.decoder.layer.0.attention.output.LayerNorm.bias of shape (768,)
2022-03-17 13:50:07,175.175 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.LayerNorm.weight loaded from
bert.decoder.layer.0.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:50:07,175.175 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.dense.bias loaded from bert.decoder.layer.0.attention.output.dense.bias of shape (768,) 2022-03-17 13:50:07,175.175 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.dense.weight loaded from bert.decoder.layer.0.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:50:07,175.175 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.key.bias loaded from bert.decoder.layer.0.attention.self.key.bias of shape (768,) 2022-03-17 13:50:07,175.175 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.key.weight loaded from bert.decoder.layer.0.attention.self.key.weight of shape (768, 768) 2022-03-17 13:50:07,175.175 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.query.bias loaded from bert.decoder.layer.0.attention.self.query.bias of shape (768,) 2022-03-17 13:50:07,175.175 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.query.weight loaded from bert.decoder.layer.0.attention.self.query.weight of shape (768, 768) 2022-03-17 13:50:07,175.175 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.value.bias loaded from bert.decoder.layer.0.attention.self.value.bias of shape (768,) 2022-03-17 13:50:07,175.175 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.value.weight loaded from bert.decoder.layer.0.attention.self.value.weight of shape (768, 768) 2022-03-17 13:50:07,175.175 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.intermediate.dense.bias loaded from bert.decoder.layer.0.intermediate.dense.bias of shape (3072,) 2022-03-17 13:50:07,175.175 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.intermediate.dense.weight loaded from bert.decoder.layer.0.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:50:07,175.175 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.LayerNorm.bias loaded from bert.decoder.layer.0.output.LayerNorm.bias of shape (768,) 2022-03-17 13:50:07,175.175 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.LayerNorm.weight loaded from bert.decoder.layer.0.output.LayerNorm.weight of shape (768,) 2022-03-17 13:50:07,175.175 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.dense.bias loaded from bert.decoder.layer.0.output.dense.bias of shape (768,) 2022-03-17 13:50:07,176.176 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.dense.weight loaded from bert.decoder.layer.0.output.dense.weight of shape (768, 3072) 2022-03-17 13:50:07,176.176 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.LayerNorm.bias loaded from bert.decoder.layer.1.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:50:07,176.176 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.LayerNorm.weight loaded from bert.decoder.layer.1.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:50:07,176.176 2829:checkpoint.py:99 
align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.dense.bias loaded from bert.decoder.layer.1.attention.output.dense.bias of shape (768,) 2022-03-17 13:50:07,176.176 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.dense.weight loaded from bert.decoder.layer.1.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:50:07,176.176 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.key.bias loaded from bert.decoder.layer.1.attention.self.key.bias of shape (768,) 2022-03-17 13:50:07,176.176 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.key.weight loaded from bert.decoder.layer.1.attention.self.key.weight of shape (768, 768) 2022-03-17 13:50:07,176.176 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.query.bias loaded from bert.decoder.layer.1.attention.self.query.bias of shape (768,) 2022-03-17 13:50:07,176.176 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.query.weight loaded from bert.decoder.layer.1.attention.self.query.weight of shape (768, 768) 2022-03-17 13:50:07,176.176 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.value.bias loaded from bert.decoder.layer.1.attention.self.value.bias of shape (768,) 2022-03-17 13:50:07,176.176 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.value.weight loaded from bert.decoder.layer.1.attention.self.value.weight of shape (768, 768) 2022-03-17 13:50:07,176.176 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.intermediate.dense.bias loaded from bert.decoder.layer.1.intermediate.dense.bias of shape (3072,) 2022-03-17 13:50:07,176.176 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.intermediate.dense.weight loaded from bert.decoder.layer.1.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:50:07,177.177 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.LayerNorm.bias loaded from bert.decoder.layer.1.output.LayerNorm.bias of shape (768,) 2022-03-17 13:50:07,177.177 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.LayerNorm.weight loaded from bert.decoder.layer.1.output.LayerNorm.weight of shape (768,) 2022-03-17 13:50:07,177.177 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.dense.bias loaded from bert.decoder.layer.1.output.dense.bias of shape (768,) 2022-03-17 13:50:07,177.177 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.dense.weight loaded from bert.decoder.layer.1.output.dense.weight of shape (768, 3072) 2022-03-17 13:50:07,177.177 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.LayerNorm.bias loaded from bert.decoder.layer.2.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:50:07,177.177 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.LayerNorm.weight loaded from bert.decoder.layer.2.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:50:07,177.177 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.dense.bias loaded from bert.decoder.layer.2.attention.output.dense.bias of 
shape (768,) 2022-03-17 13:50:07,177.177 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.dense.weight loaded from bert.decoder.layer.2.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:50:07,177.177 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.key.bias loaded from bert.decoder.layer.2.attention.self.key.bias of shape (768,) 2022-03-17 13:50:07,177.177 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.key.weight loaded from bert.decoder.layer.2.attention.self.key.weight of shape (768, 768) 2022-03-17 13:50:07,177.177 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.query.bias loaded from bert.decoder.layer.2.attention.self.query.bias of shape (768,) 2022-03-17 13:50:07,177.177 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.query.weight loaded from bert.decoder.layer.2.attention.self.query.weight of shape (768, 768) 2022-03-17 13:50:07,177.177 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.value.bias loaded from bert.decoder.layer.2.attention.self.value.bias of shape (768,) 2022-03-17 13:50:07,177.177 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.value.weight loaded from bert.decoder.layer.2.attention.self.value.weight of shape (768, 768) 2022-03-17 13:50:07,178.178 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.intermediate.dense.bias loaded from bert.decoder.layer.2.intermediate.dense.bias of shape (3072,) 2022-03-17 13:50:07,178.178 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.intermediate.dense.weight loaded from bert.decoder.layer.2.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:50:07,178.178 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.LayerNorm.bias loaded from bert.decoder.layer.2.output.LayerNorm.bias of shape (768,) 2022-03-17 13:50:07,178.178 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.LayerNorm.weight loaded from bert.decoder.layer.2.output.LayerNorm.weight of shape (768,) 2022-03-17 13:50:07,178.178 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.dense.bias loaded from bert.decoder.layer.2.output.dense.bias of shape (768,) 2022-03-17 13:50:07,178.178 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.dense.weight loaded from bert.decoder.layer.2.output.dense.weight of shape (768, 3072) 2022-03-17 13:50:07,178.178 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.LayerNorm.bias loaded from bert.decoder.layer.3.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:50:07,178.178 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.LayerNorm.weight loaded from bert.decoder.layer.3.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:50:07,178.178 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.dense.bias loaded from bert.decoder.layer.3.attention.output.dense.bias of shape (768,) 2022-03-17 13:50:07,178.178 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.decoder.layer.3.attention.output.dense.weight loaded from bert.decoder.layer.3.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:50:07,178.178 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.key.bias loaded from bert.decoder.layer.3.attention.self.key.bias of shape (768,) 2022-03-17 13:50:07,179.179 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.key.weight loaded from bert.decoder.layer.3.attention.self.key.weight of shape (768, 768) 2022-03-17 13:50:07,179.179 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.query.bias loaded from bert.decoder.layer.3.attention.self.query.bias of shape (768,) 2022-03-17 13:50:07,179.179 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.query.weight loaded from bert.decoder.layer.3.attention.self.query.weight of shape (768, 768) 2022-03-17 13:50:07,179.179 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.value.bias loaded from bert.decoder.layer.3.attention.self.value.bias of shape (768,) 2022-03-17 13:50:07,179.179 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.value.weight loaded from bert.decoder.layer.3.attention.self.value.weight of shape (768, 768) 2022-03-17 13:50:07,179.179 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.intermediate.dense.bias loaded from bert.decoder.layer.3.intermediate.dense.bias of shape (3072,) 2022-03-17 13:50:07,179.179 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.intermediate.dense.weight loaded from bert.decoder.layer.3.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:50:07,179.179 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.LayerNorm.bias loaded from bert.decoder.layer.3.output.LayerNorm.bias of shape (768,) 2022-03-17 13:50:07,179.179 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.LayerNorm.weight loaded from bert.decoder.layer.3.output.LayerNorm.weight of shape (768,) 2022-03-17 13:50:07,179.179 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.dense.bias loaded from bert.decoder.layer.3.output.dense.bias of shape (768,) 2022-03-17 13:50:07,179.179 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.dense.weight loaded from bert.decoder.layer.3.output.dense.weight of shape (768, 3072) 2022-03-17 13:50:07,179.179 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.LayerNorm.bias loaded from bert.embeddings.LayerNorm.bias of shape (768,) 2022-03-17 13:50:07,179.179 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.LayerNorm.weight loaded from bert.embeddings.LayerNorm.weight of shape (768,) 2022-03-17 13:50:07,179.179 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.position_embeddings.weight loaded from bert.embeddings.position_embeddings.weight of shape (512, 768) 2022-03-17 13:50:07,180.180 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.token_type_embeddings.weight loaded from bert.embeddings.token_type_embeddings.weight of shape (2, 768) 2022-03-17 13:50:07,180.180 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.embeddings.word_embeddings.weight loaded from bert.embeddings.word_embeddings.weight of shape (30522, 768) 2022-03-17 13:50:07,180.180 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.proj.bias loaded from bert.encoder.blocks.0.attn.proj.bias of shape (768,) 2022-03-17 13:50:07,180.180 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.proj.weight loaded from bert.encoder.blocks.0.attn.proj.weight of shape (768, 768) 2022-03-17 13:50:07,180.180 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.qkv.bias loaded from bert.encoder.blocks.0.attn.qkv.bias of shape (2304,) 2022-03-17 13:50:07,180.180 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.qkv.weight loaded from bert.encoder.blocks.0.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:50:07,180.180 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc1.bias loaded from bert.encoder.blocks.0.mlp.fc1.bias of shape (3072,) 2022-03-17 13:50:07,180.180 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc1.weight loaded from bert.encoder.blocks.0.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:50:07,180.180 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc2.bias loaded from bert.encoder.blocks.0.mlp.fc2.bias of shape (768,) 2022-03-17 13:50:07,180.180 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc2.weight loaded from bert.encoder.blocks.0.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:50:07,180.180 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm1.bias loaded from bert.encoder.blocks.0.norm1.bias of shape (768,) 2022-03-17 13:50:07,181.181 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm1.weight loaded from bert.encoder.blocks.0.norm1.weight of shape (768,) 2022-03-17 13:50:07,181.181 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm2.bias loaded from bert.encoder.blocks.0.norm2.bias of shape (768,) 2022-03-17 13:50:07,181.181 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm2.weight loaded from bert.encoder.blocks.0.norm2.weight of shape (768,) 2022-03-17 13:50:07,181.181 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.proj.bias loaded from bert.encoder.blocks.1.attn.proj.bias of shape (768,) 2022-03-17 13:50:07,181.181 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.proj.weight loaded from bert.encoder.blocks.1.attn.proj.weight of shape (768, 768) 2022-03-17 13:50:07,181.181 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.qkv.bias loaded from bert.encoder.blocks.1.attn.qkv.bias of shape (2304,) 2022-03-17 13:50:07,181.181 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.qkv.weight loaded from bert.encoder.blocks.1.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:50:07,181.181 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc1.bias loaded from bert.encoder.blocks.1.mlp.fc1.bias of shape (3072,) 2022-03-17 13:50:07,181.181 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc1.weight loaded from bert.encoder.blocks.1.mlp.fc1.weight of shape 
(3072, 768) 2022-03-17 13:50:07,181.181 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc2.bias loaded from bert.encoder.blocks.1.mlp.fc2.bias of shape (768,) 2022-03-17 13:50:07,181.181 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc2.weight loaded from bert.encoder.blocks.1.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:50:07,181.181 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm1.bias loaded from bert.encoder.blocks.1.norm1.bias of shape (768,) 2022-03-17 13:50:07,181.181 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm1.weight loaded from bert.encoder.blocks.1.norm1.weight of shape (768,) 2022-03-17 13:50:07,182.182 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm2.bias loaded from bert.encoder.blocks.1.norm2.bias of shape (768,) 2022-03-17 13:50:07,182.182 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm2.weight loaded from bert.encoder.blocks.1.norm2.weight of shape (768,) 2022-03-17 13:50:07,182.182 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.proj.bias loaded from bert.encoder.blocks.10.attn.proj.bias of shape (768,) 2022-03-17 13:50:07,182.182 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.proj.weight loaded from bert.encoder.blocks.10.attn.proj.weight of shape (768, 768) 2022-03-17 13:50:07,182.182 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.qkv.bias loaded from bert.encoder.blocks.10.attn.qkv.bias of shape (2304,) 2022-03-17 13:50:07,182.182 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.qkv.weight loaded from bert.encoder.blocks.10.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:50:07,182.182 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc1.bias loaded from bert.encoder.blocks.10.mlp.fc1.bias of shape (3072,) 2022-03-17 13:50:07,182.182 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc1.weight loaded from bert.encoder.blocks.10.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:50:07,182.182 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc2.bias loaded from bert.encoder.blocks.10.mlp.fc2.bias of shape (768,) 2022-03-17 13:50:07,182.182 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc2.weight loaded from bert.encoder.blocks.10.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:50:07,182.182 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm1.bias loaded from bert.encoder.blocks.10.norm1.bias of shape (768,) 2022-03-17 13:50:07,182.182 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm1.weight loaded from bert.encoder.blocks.10.norm1.weight of shape (768,) 2022-03-17 13:50:07,182.182 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm2.bias loaded from bert.encoder.blocks.10.norm2.bias of shape (768,) 2022-03-17 13:50:07,182.182 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm2.weight loaded from bert.encoder.blocks.10.norm2.weight of shape (768,) 2022-03-17 13:50:07,183.183 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.encoder.blocks.11.attn.proj.bias loaded from bert.encoder.blocks.11.attn.proj.bias of shape (768,) 2022-03-17 13:50:07,183.183 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.proj.weight loaded from bert.encoder.blocks.11.attn.proj.weight of shape (768, 768) 2022-03-17 13:50:07,183.183 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.qkv.bias loaded from bert.encoder.blocks.11.attn.qkv.bias of shape (2304,) 2022-03-17 13:50:07,183.183 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.qkv.weight loaded from bert.encoder.blocks.11.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:50:07,183.183 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc1.bias loaded from bert.encoder.blocks.11.mlp.fc1.bias of shape (3072,) 2022-03-17 13:50:07,183.183 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc1.weight loaded from bert.encoder.blocks.11.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:50:07,183.183 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc2.bias loaded from bert.encoder.blocks.11.mlp.fc2.bias of shape (768,) 2022-03-17 13:50:07,183.183 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc2.weight loaded from bert.encoder.blocks.11.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:50:07,183.183 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm1.bias loaded from bert.encoder.blocks.11.norm1.bias of shape (768,) 2022-03-17 13:50:07,183.183 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm1.weight loaded from bert.encoder.blocks.11.norm1.weight of shape (768,) 2022-03-17 13:50:07,183.183 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm2.bias loaded from bert.encoder.blocks.11.norm2.bias of shape (768,) 2022-03-17 13:50:07,183.183 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm2.weight loaded from bert.encoder.blocks.11.norm2.weight of shape (768,) 2022-03-17 13:50:07,183.183 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.proj.bias loaded from bert.encoder.blocks.2.attn.proj.bias of shape (768,) 2022-03-17 13:50:07,183.183 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.proj.weight loaded from bert.encoder.blocks.2.attn.proj.weight of shape (768, 768) 2022-03-17 13:50:07,184.184 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.qkv.bias loaded from bert.encoder.blocks.2.attn.qkv.bias of shape (2304,) 2022-03-17 13:50:07,184.184 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.qkv.weight loaded from bert.encoder.blocks.2.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:50:07,184.184 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc1.bias loaded from bert.encoder.blocks.2.mlp.fc1.bias of shape (3072,) 2022-03-17 13:50:07,184.184 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc1.weight loaded from bert.encoder.blocks.2.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:50:07,184.184 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc2.bias loaded from 
bert.encoder.blocks.2.mlp.fc2.bias of shape (768,) 2022-03-17 13:50:07,184.184 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc2.weight loaded from bert.encoder.blocks.2.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:50:07,184.184 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm1.bias loaded from bert.encoder.blocks.2.norm1.bias of shape (768,) 2022-03-17 13:50:07,184.184 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm1.weight loaded from bert.encoder.blocks.2.norm1.weight of shape (768,) 2022-03-17 13:50:07,184.184 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm2.bias loaded from bert.encoder.blocks.2.norm2.bias of shape (768,) 2022-03-17 13:50:07,184.184 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm2.weight loaded from bert.encoder.blocks.2.norm2.weight of shape (768,) 2022-03-17 13:50:07,184.184 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.proj.bias loaded from bert.encoder.blocks.3.attn.proj.bias of shape (768,) 2022-03-17 13:50:07,184.184 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.proj.weight loaded from bert.encoder.blocks.3.attn.proj.weight of shape (768, 768) 2022-03-17 13:50:07,184.184 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.qkv.bias loaded from bert.encoder.blocks.3.attn.qkv.bias of shape (2304,) 2022-03-17 13:50:07,185.185 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.qkv.weight loaded from bert.encoder.blocks.3.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:50:07,185.185 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc1.bias loaded from bert.encoder.blocks.3.mlp.fc1.bias of shape (3072,) 2022-03-17 13:50:07,185.185 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc1.weight loaded from bert.encoder.blocks.3.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:50:07,185.185 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc2.bias loaded from bert.encoder.blocks.3.mlp.fc2.bias of shape (768,) 2022-03-17 13:50:07,185.185 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc2.weight loaded from bert.encoder.blocks.3.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:50:07,185.185 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm1.bias loaded from bert.encoder.blocks.3.norm1.bias of shape (768,) 2022-03-17 13:50:07,185.185 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm1.weight loaded from bert.encoder.blocks.3.norm1.weight of shape (768,) 2022-03-17 13:50:07,185.185 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm2.bias loaded from bert.encoder.blocks.3.norm2.bias of shape (768,) 2022-03-17 13:50:07,185.185 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm2.weight loaded from bert.encoder.blocks.3.norm2.weight of shape (768,) 2022-03-17 13:50:07,185.185 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.attn.proj.bias loaded from bert.encoder.blocks.4.attn.proj.bias of shape (768,) 2022-03-17 13:50:07,185.185 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.encoder.blocks.4.attn.proj.weight loaded from bert.encoder.blocks.4.attn.proj.weight of shape (768, 768) 2022-03-17 13:50:07,185.185 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.attn.qkv.bias loaded from bert.encoder.blocks.4.attn.qkv.bias of shape (2304,) 2022-03-17 13:50:07,185.185 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.attn.qkv.weight loaded from bert.encoder.blocks.4.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:50:07,186.186 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.mlp.fc1.bias loaded from bert.encoder.blocks.4.mlp.fc1.bias of shape (3072,) 2022-03-17 13:50:07,186.186 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.mlp.fc1.weight loaded from bert.encoder.blocks.4.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:50:07,186.186 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.mlp.fc2.bias loaded from bert.encoder.blocks.4.mlp.fc2.bias of shape (768,) 2022-03-17 13:50:07,186.186 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.mlp.fc2.weight loaded from bert.encoder.blocks.4.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:50:07,186.186 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.norm1.bias loaded from bert.encoder.blocks.4.norm1.bias of shape (768,) 2022-03-17 13:50:07,186.186 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.norm1.weight loaded from bert.encoder.blocks.4.norm1.weight of shape (768,) 2022-03-17 13:50:07,186.186 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.norm2.bias loaded from bert.encoder.blocks.4.norm2.bias of shape (768,) 2022-03-17 13:50:07,186.186 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.norm2.weight loaded from bert.encoder.blocks.4.norm2.weight of shape (768,) 2022-03-17 13:50:07,186.186 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.attn.proj.bias loaded from bert.encoder.blocks.5.attn.proj.bias of shape (768,) 2022-03-17 13:50:07,186.186 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.attn.proj.weight loaded from bert.encoder.blocks.5.attn.proj.weight of shape (768, 768) 2022-03-17 13:50:07,186.186 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.attn.qkv.bias loaded from bert.encoder.blocks.5.attn.qkv.bias of shape (2304,) 2022-03-17 13:50:07,186.186 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.attn.qkv.weight loaded from bert.encoder.blocks.5.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:50:07,186.186 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.mlp.fc1.bias loaded from bert.encoder.blocks.5.mlp.fc1.bias of shape (3072,) 2022-03-17 13:50:07,186.186 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.mlp.fc1.weight loaded from bert.encoder.blocks.5.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:50:07,187.187 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.mlp.fc2.bias loaded from bert.encoder.blocks.5.mlp.fc2.bias of shape (768,) 2022-03-17 13:50:07,187.187 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.mlp.fc2.weight loaded from bert.encoder.blocks.5.mlp.fc2.weight of shape (768, 
3072) 2022-03-17 13:50:07,187.187 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm1.bias loaded from bert.encoder.blocks.5.norm1.bias of shape (768,) 2022-03-17 13:50:07,187.187 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm1.weight loaded from bert.encoder.blocks.5.norm1.weight of shape (768,) 2022-03-17 13:50:07,187.187 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm2.bias loaded from bert.encoder.blocks.5.norm2.bias of shape (768,) 2022-03-17 13:50:07,187.187 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm2.weight loaded from bert.encoder.blocks.5.norm2.weight of shape (768,) 2022-03-17 13:50:07,187.187 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.attn.proj.bias loaded from bert.encoder.blocks.6.attn.proj.bias of shape (768,) 2022-03-17 13:50:07,187.187 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.attn.proj.weight loaded from bert.encoder.blocks.6.attn.proj.weight of shape (768, 768) 2022-03-17 13:50:07,187.187 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.attn.qkv.bias loaded from bert.encoder.blocks.6.attn.qkv.bias of shape (2304,) 2022-03-17 13:50:07,187.187 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.attn.qkv.weight loaded from bert.encoder.blocks.6.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:50:07,187.187 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc1.bias loaded from bert.encoder.blocks.6.mlp.fc1.bias of shape (3072,) 2022-03-17 13:50:07,187.187 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc1.weight loaded from bert.encoder.blocks.6.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:50:07,187.187 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc2.bias loaded from bert.encoder.blocks.6.mlp.fc2.bias of shape (768,) 2022-03-17 13:50:07,187.187 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc2.weight loaded from bert.encoder.blocks.6.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:50:07,188.188 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm1.bias loaded from bert.encoder.blocks.6.norm1.bias of shape (768,) 2022-03-17 13:50:07,188.188 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm1.weight loaded from bert.encoder.blocks.6.norm1.weight of shape (768,) 2022-03-17 13:50:07,188.188 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm2.bias loaded from bert.encoder.blocks.6.norm2.bias of shape (768,) 2022-03-17 13:50:07,188.188 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm2.weight loaded from bert.encoder.blocks.6.norm2.weight of shape (768,) 2022-03-17 13:50:07,188.188 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.proj.bias loaded from bert.encoder.blocks.7.attn.proj.bias of shape (768,) 2022-03-17 13:50:07,188.188 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.proj.weight loaded from bert.encoder.blocks.7.attn.proj.weight of shape (768, 768) 2022-03-17 13:50:07,188.188 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.qkv.bias loaded 
from bert.encoder.blocks.7.attn.qkv.bias of shape (2304,) 2022-03-17 13:50:07,188.188 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.qkv.weight loaded from bert.encoder.blocks.7.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:50:07,188.188 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc1.bias loaded from bert.encoder.blocks.7.mlp.fc1.bias of shape (3072,) 2022-03-17 13:50:07,188.188 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc1.weight loaded from bert.encoder.blocks.7.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:50:07,188.188 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc2.bias loaded from bert.encoder.blocks.7.mlp.fc2.bias of shape (768,) 2022-03-17 13:50:07,188.188 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc2.weight loaded from bert.encoder.blocks.7.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:50:07,188.188 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm1.bias loaded from bert.encoder.blocks.7.norm1.bias of shape (768,) 2022-03-17 13:50:07,189.189 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm1.weight loaded from bert.encoder.blocks.7.norm1.weight of shape (768,) 2022-03-17 13:50:07,189.189 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm2.bias loaded from bert.encoder.blocks.7.norm2.bias of shape (768,) 2022-03-17 13:50:07,189.189 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm2.weight loaded from bert.encoder.blocks.7.norm2.weight of shape (768,) 2022-03-17 13:50:07,189.189 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.proj.bias loaded from bert.encoder.blocks.8.attn.proj.bias of shape (768,) 2022-03-17 13:50:07,189.189 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.proj.weight loaded from bert.encoder.blocks.8.attn.proj.weight of shape (768, 768) 2022-03-17 13:50:07,189.189 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.qkv.bias loaded from bert.encoder.blocks.8.attn.qkv.bias of shape (2304,) 2022-03-17 13:50:07,189.189 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.qkv.weight loaded from bert.encoder.blocks.8.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:50:07,189.189 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc1.bias loaded from bert.encoder.blocks.8.mlp.fc1.bias of shape (3072,) 2022-03-17 13:50:07,189.189 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc1.weight loaded from bert.encoder.blocks.8.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:50:07,189.189 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc2.bias loaded from bert.encoder.blocks.8.mlp.fc2.bias of shape (768,) 2022-03-17 13:50:07,189.189 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc2.weight loaded from bert.encoder.blocks.8.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:50:07,189.189 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm1.bias loaded from bert.encoder.blocks.8.norm1.bias of shape (768,) 2022-03-17 13:50:07,189.189 2829:checkpoint.py:99 
align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm1.weight loaded from bert.encoder.blocks.8.norm1.weight of shape (768,) 2022-03-17 13:50:07,189.189 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm2.bias loaded from bert.encoder.blocks.8.norm2.bias of shape (768,) 2022-03-17 13:50:07,190.190 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm2.weight loaded from bert.encoder.blocks.8.norm2.weight of shape (768,) 2022-03-17 13:50:07,190.190 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.proj.bias loaded from bert.encoder.blocks.9.attn.proj.bias of shape (768,) 2022-03-17 13:50:07,190.190 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.proj.weight loaded from bert.encoder.blocks.9.attn.proj.weight of shape (768, 768) 2022-03-17 13:50:07,190.190 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.qkv.bias loaded from bert.encoder.blocks.9.attn.qkv.bias of shape (2304,) 2022-03-17 13:50:07,190.190 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.qkv.weight loaded from bert.encoder.blocks.9.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:50:07,190.190 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc1.bias loaded from bert.encoder.blocks.9.mlp.fc1.bias of shape (3072,) 2022-03-17 13:50:07,190.190 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc1.weight loaded from bert.encoder.blocks.9.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:50:07,190.190 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc2.bias loaded from bert.encoder.blocks.9.mlp.fc2.bias of shape (768,) 2022-03-17 13:50:07,190.190 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc2.weight loaded from bert.encoder.blocks.9.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:50:07,190.190 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm1.bias loaded from bert.encoder.blocks.9.norm1.bias of shape (768,) 2022-03-17 13:50:07,190.190 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm1.weight loaded from bert.encoder.blocks.9.norm1.weight of shape (768,) 2022-03-17 13:50:07,190.190 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm2.bias loaded from bert.encoder.blocks.9.norm2.bias of shape (768,) 2022-03-17 13:50:07,190.190 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm2.weight loaded from bert.encoder.blocks.9.norm2.weight of shape (768,) 2022-03-17 13:50:07,191.191 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.proj.bias loaded from bert.encoder.tag_blocks.0.attn.proj.bias of shape (768,) 2022-03-17 13:50:07,191.191 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.proj.weight loaded from bert.encoder.tag_blocks.0.attn.proj.weight of shape (768, 768) 2022-03-17 13:50:07,191.191 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.qkv.bias loaded from bert.encoder.tag_blocks.0.attn.qkv.bias of shape (2304,) 2022-03-17 13:50:07,191.191 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.qkv.weight loaded from 
bert.encoder.tag_blocks.0.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:50:07,191.191 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc1.bias loaded from bert.encoder.tag_blocks.0.mlp.fc1.bias of shape (3072,) 2022-03-17 13:50:07,191.191 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc1.weight loaded from bert.encoder.tag_blocks.0.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:50:07,191.191 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc2.bias loaded from bert.encoder.tag_blocks.0.mlp.fc2.bias of shape (768,) 2022-03-17 13:50:07,191.191 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc2.weight loaded from bert.encoder.tag_blocks.0.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:50:07,191.191 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm1.bias loaded from bert.encoder.tag_blocks.0.norm1.bias of shape (768,) 2022-03-17 13:50:07,191.191 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm1.weight loaded from bert.encoder.tag_blocks.0.norm1.weight of shape (768,) 2022-03-17 13:50:07,191.191 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm2.bias loaded from bert.encoder.tag_blocks.0.norm2.bias of shape (768,) 2022-03-17 13:50:07,191.191 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm2.weight loaded from bert.encoder.tag_blocks.0.norm2.weight of shape (768,) 2022-03-17 13:50:07,191.191 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.proj.bias loaded from bert.encoder.tag_blocks.1.attn.proj.bias of shape (768,) 2022-03-17 13:50:07,191.191 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.proj.weight loaded from bert.encoder.tag_blocks.1.attn.proj.weight of shape (768, 768) 2022-03-17 13:50:07,192.192 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.qkv.bias loaded from bert.encoder.tag_blocks.1.attn.qkv.bias of shape (2304,) 2022-03-17 13:50:07,192.192 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.qkv.weight loaded from bert.encoder.tag_blocks.1.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:50:07,192.192 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc1.bias loaded from bert.encoder.tag_blocks.1.mlp.fc1.bias of shape (3072,) 2022-03-17 13:50:07,192.192 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc1.weight loaded from bert.encoder.tag_blocks.1.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:50:07,192.192 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc2.bias loaded from bert.encoder.tag_blocks.1.mlp.fc2.bias of shape (768,) 2022-03-17 13:50:07,192.192 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc2.weight loaded from bert.encoder.tag_blocks.1.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:50:07,192.192 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm1.bias loaded from bert.encoder.tag_blocks.1.norm1.bias of shape (768,) 2022-03-17 13:50:07,192.192 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.encoder.tag_blocks.1.norm1.weight loaded from bert.encoder.tag_blocks.1.norm1.weight of shape (768,) 2022-03-17 13:50:07,192.192 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm2.bias loaded from bert.encoder.tag_blocks.1.norm2.bias of shape (768,) 2022-03-17 13:50:07,193.193 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm2.weight loaded from bert.encoder.tag_blocks.1.norm2.weight of shape (768,) 2022-03-17 13:50:07,193.193 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.proj.bias loaded from bert.encoder.tag_blocks.2.attn.proj.bias of shape (768,) 2022-03-17 13:50:07,193.193 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.proj.weight loaded from bert.encoder.tag_blocks.2.attn.proj.weight of shape (768, 768) 2022-03-17 13:50:07,193.193 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.qkv.bias loaded from bert.encoder.tag_blocks.2.attn.qkv.bias of shape (2304,) 2022-03-17 13:50:07,193.193 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.qkv.weight loaded from bert.encoder.tag_blocks.2.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:50:07,193.193 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc1.bias loaded from bert.encoder.tag_blocks.2.mlp.fc1.bias of shape (3072,) 2022-03-17 13:50:07,193.193 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc1.weight loaded from bert.encoder.tag_blocks.2.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:50:07,193.193 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc2.bias loaded from bert.encoder.tag_blocks.2.mlp.fc2.bias of shape (768,) 2022-03-17 13:50:07,193.193 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc2.weight loaded from bert.encoder.tag_blocks.2.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:50:07,193.193 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm1.bias loaded from bert.encoder.tag_blocks.2.norm1.bias of shape (768,) 2022-03-17 13:50:07,193.193 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm1.weight loaded from bert.encoder.tag_blocks.2.norm1.weight of shape (768,) 2022-03-17 13:50:07,193.193 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm2.bias loaded from bert.encoder.tag_blocks.2.norm2.bias of shape (768,) 2022-03-17 13:50:07,193.193 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm2.weight loaded from bert.encoder.tag_blocks.2.norm2.weight of shape (768,) 2022-03-17 13:50:07,194.194 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.proj.bias loaded from bert.encoder.tag_blocks.3.attn.proj.bias of shape (768,) 2022-03-17 13:50:07,194.194 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.proj.weight loaded from bert.encoder.tag_blocks.3.attn.proj.weight of shape (768, 768) 2022-03-17 13:50:07,194.194 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.qkv.bias loaded from bert.encoder.tag_blocks.3.attn.qkv.bias of shape (2304,) 2022-03-17 13:50:07,194.194 2829:checkpoint.py:99 
align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.qkv.weight loaded from bert.encoder.tag_blocks.3.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:50:07,194.194 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc1.bias loaded from bert.encoder.tag_blocks.3.mlp.fc1.bias of shape (3072,) 2022-03-17 13:50:07,194.194 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc1.weight loaded from bert.encoder.tag_blocks.3.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:50:07,194.194 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc2.bias loaded from bert.encoder.tag_blocks.3.mlp.fc2.bias of shape (768,) 2022-03-17 13:50:07,194.194 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc2.weight loaded from bert.encoder.tag_blocks.3.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:50:07,194.194 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm1.bias loaded from bert.encoder.tag_blocks.3.norm1.bias of shape (768,) 2022-03-17 13:50:07,194.194 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm1.weight loaded from bert.encoder.tag_blocks.3.norm1.weight of shape (768,) 2022-03-17 13:50:07,194.194 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm2.bias loaded from bert.encoder.tag_blocks.3.norm2.bias of shape (768,) 2022-03-17 13:50:07,194.194 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm2.weight loaded from bert.encoder.tag_blocks.3.norm2.weight of shape (768,) 2022-03-17 13:50:07,194.194 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.LayerNorm.bias loaded from bert.extra_embeddings.LayerNorm.bias of shape (768,) 2022-03-17 13:50:07,194.194 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.LayerNorm.weight loaded from bert.extra_embeddings.LayerNorm.weight of shape (768,) 2022-03-17 13:50:07,195.195 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.position_embeddings.weight loaded from bert.extra_embeddings.position_embeddings.weight of shape (512, 768) 2022-03-17 13:50:07,195.195 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.token_type_embeddings.weight loaded from bert.extra_embeddings.token_type_embeddings.weight of shape (2, 768) 2022-03-17 13:50:07,195.195 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.word_embeddings.weight loaded from bert.extra_embeddings.word_embeddings.weight of shape (30522, 768) 2022-03-17 13:50:07,195.195 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.pooler.dense.bias loaded from bert.pooler.dense.bias of shape (768,) 2022-03-17 13:50:07,195.195 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.pooler.dense.weight loaded from bert.pooler.dense.weight of shape (768, 768) 2022-03-17 13:50:07,195.195 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.bias loaded from bert.tag_logit.predictions.bias of shape (30522,) 2022-03-17 13:50:07,195.195 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.decoder.weight loaded from bert.tag_logit.predictions.decoder.weight of shape (30522, 768) 2022-03-17 13:50:07,195.195 2829:checkpoint.py:99 
align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.LayerNorm.bias loaded from bert.tag_logit.predictions.transform.LayerNorm.bias of shape (768,)
2022-03-17 13:50:07,195.195 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.LayerNorm.weight loaded from bert.tag_logit.predictions.transform.LayerNorm.weight of shape (768,)
2022-03-17 13:50:07,195.195 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.dense.bias loaded from bert.tag_logit.predictions.transform.dense.bias of shape (768,)
2022-03-17 13:50:07,195.195 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.dense.weight loaded from bert.tag_logit.predictions.transform.dense.weight of shape (768, 768)
2022-03-17 13:50:07,195.195 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.bias loaded from cls.predictions.bias of shape (30522,)
2022-03-17 13:50:07,195.195 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.decoder.weight loaded from cls.predictions.decoder.weight of shape (30522, 768)
2022-03-17 13:50:07,196.196 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.LayerNorm.bias loaded from cls.predictions.transform.LayerNorm.bias of shape (768,)
2022-03-17 13:50:07,196.196 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.LayerNorm.weight loaded from cls.predictions.transform.LayerNorm.weight of shape (768,)
2022-03-17 13:50:07,196.196 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.dense.bias loaded from cls.predictions.transform.dense.bias of shape (768,)
2022-03-17 13:50:07,196.196 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.dense.weight loaded from cls.predictions.transform.dense.weight of shape (768, 768)
2022-03-17 13:50:07,196.196 2829:checkpoint.py:104 align_and_update_state_dicts(): target model param = 288; name matched = 288; loaded = 288
2022-03-17 13:50:07,196.196 2829:checkpoint.py:107 align_and_update_state_dicts(): from loaded; ignore = []
2022-03-17 13:50:07,199.199 2829:torch_common.py:1069 load_model_state_ignore_mismatch(): unique keys in init dict = ['module.cls.predictions.decoder.weight']; total = 1
2022-03-17 13:50:07,360.360 2829:torch_common.py:1074 load_model_state_ignore_mismatch(): unique key (not initialized) in current model = ['module.cls.predictions.decoder.weight']
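As with the model_iter_0030000 checkpoint earlier, these align_and_update_state_dicts() records show all 288 target parameters matching a checkpoint key that differs at most by the DataParallel-style "module." prefix, after which load_model_state_ignore_mismatch() reports the keys that do not line up one-to-one between the two dicts (here only module.cls.predictions.decoder.weight). A rough sketch of that kind of prefix-tolerant matching, under the assumption that this is essentially what checkpoint.py does; align_state_dict is a hypothetical name, not the project's function:

```python
import logging
from collections import OrderedDict

def align_state_dict(model_keys, loaded_state):
    # Match each model parameter name against the checkpoint, accepting a
    # missing or extra DataParallel-style "module." prefix (hypothetical
    # sketch of what the align_and_update_state_dicts() records suggest).
    aligned = OrderedDict()
    for key in model_keys:
        candidates = [key]
        if key.startswith("module."):
            candidates.append(key[len("module."):])
        else:
            candidates.append("module." + key)
        for cand in candidates:
            if cand in loaded_state:
                # Mirrors the "X loaded from Y of shape S" records above.
                logging.info("%s loaded from %s of shape %s",
                             key, cand, tuple(loaded_state[cand].shape))
                aligned[key] = loaded_state[cand]
                break
    logging.info("target model param = %d; name matched = %d",
                 len(model_keys), len(aligned))
    return aligned
```

Called with model.state_dict().keys() and a loaded torch checkpoint dict, this would produce the "module.X loaded from X" pattern seen throughout the listing above.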
split=test, version=vinvl) TransCaptionTensorizer(tensorizer=, pad_to_max=True, pad_image_to_max=True) )) uni_pipeline.py:908: 0%| | 0/10 [00:00 locals()[function_name](**kwargs) File "src/qd/pipeline.py", line 637, in pipeline_train_eval_multi pip.monitor_train() File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1224, in monitor_train need_wait_models = self.pred_eval_intermediate_models() File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1286, in pred_eval_intermediate_models pred = self.ensure_predict(model_file=model_file) File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 886, in ensure_predict self.predict(model_file, predict_result_file) File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 989, in predict model = self.get_model(is_train=False) File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model model = self.get_raw_model(is_train) File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model model = TaggerEncDecSplitForImageCaptioning(config=config) # init from scratch File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__ self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config) File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__ self.encoder = TIMMVitSplitEncoder(config) File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__ logging.info(config) Unable to print the message and arguments - possible formatting error. Use the traceback above to help find the error. --- Logging error --- Traceback (most recent call last): File "/opt/conda/lib/python3.7/logging/__init__.py", line 1025, in emit msg = self.format(record) File "/opt/conda/lib/python3.7/logging/__init__.py", line 869, in format return fmt.format(record) File "/opt/conda/lib/python3.7/logging/__init__.py", line 608, in format record.message = record.getMessage() File "/opt/conda/lib/python3.7/logging/__init__.py", line 367, in getMessage msg = str(self.msg) File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 236, in __repr__ return str(self.to_json_string()) File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 245, in to_json_string return json.dumps(self.to_dict(), indent=2, sort_keys=True) + "\n" File "/opt/conda/lib/python3.7/json/__init__.py", line 238, in dumps **kw).encode(obj) File "/opt/conda/lib/python3.7/json/encoder.py", line 201, in encode chunks = list(chunks) File "/opt/conda/lib/python3.7/json/encoder.py", line 431, in _iterencode yield from _iterencode_dict(o, _current_indent_level) File "/opt/conda/lib/python3.7/json/encoder.py", line 405, in _iterencode_dict yield from chunks File "/opt/conda/lib/python3.7/json/encoder.py", line 438, in _iterencode o = _default(o) File "/opt/conda/lib/python3.7/json/encoder.py", line 179, in default raise TypeError(f'Object of type {o.__class__.__name__} ' TypeError: Object of type BertTokenizer is not JSON serializable Call stack: File "src/qd/pipeline.py", line 1368, in locals()[function_name](**kwargs) File "src/qd/pipeline.py", line 637, in pipeline_train_eval_multi pip.monitor_train() File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1224, in monitor_train need_wait_models = self.pred_eval_intermediate_models() File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1286, in pred_eval_intermediate_models pred = self.ensure_predict(model_file=model_file) File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 886, in ensure_predict self.predict(model_file, predict_result_file) 
File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 989, in predict model = self.get_model(is_train=False) File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model model = self.get_raw_model(is_train) File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model model = TaggerEncDecSplitForImageCaptioning(config=config) # init from scratch File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__ self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config) File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__ self.encoder = TIMMVitSplitEncoder(config) File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__ logging.info(config) Unable to print the message and arguments - possible formatting error. Use the traceback above to help find the error. 2022-03-17 13:52:20,314.314 2829:modeling_bert.py:529 __init__(): TIMM Split image encoder load from pre-trained: True 2022-03-17 13:52:21,381.381 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py 2022-03-17 13:52:22,920.920 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py 2022-03-17 13:52:23,568.568 2829:modeling_bert.py:2677 __init__(): BertImgModel Image Dimension: 2054 2022-03-17 13:52:23,716.716 2829:modeling_bert.py:2688 __init__(): BertImgModel Image Dimension: 2054 2022-03-17 13:52:24,506.506 2829:tagger_caption_uni_pipeline_expanding.py:1130 get_image_encoder_model(): VIT image encoder loaded from pre-trained weight! Note that this might be replaced by pre-trained checkpoint later! 2022-03-17 13:52:24,506.506 2829:tagger_caption_uni_pipeline_expanding.py:1134 get_image_encoder_model(): Non-Patch Selection Mode. 
2022-03-17 13:52:25,639.639 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py 2022-03-17 13:52:26,170.170 2829:checkpoint.py:240 load(): Loading checkpoint from output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0040000.pt 2022-03-17 13:52:34,211.211 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.cls_token loaded from image_encoder.module.cls_token of shape (1, 1, 768) 2022-03-17 13:52:34,211.211 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.head.bias loaded from image_encoder.module.head.bias of shape (1000,) 2022-03-17 13:52:34,211.211 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.head.weight loaded from image_encoder.module.head.weight of shape (1000, 768) 2022-03-17 13:52:34,211.211 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.patch_embed.proj.bias loaded from image_encoder.module.patch_embed.proj.bias of shape (768,) 2022-03-17 13:52:34,211.211 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.patch_embed.proj.weight loaded from image_encoder.module.patch_embed.proj.weight of shape (768, 3, 16, 16) 2022-03-17 13:52:34,212.212 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.pos_embed loaded from image_encoder.module.pos_embed of shape (1, 577, 768) 2022-03-17 13:52:34,212.212 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.caption_pooler.dense.bias loaded from bert.caption_pooler.dense.bias of shape (768,) 2022-03-17 13:52:34,212.212 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.caption_pooler.dense.weight loaded from bert.caption_pooler.dense.weight of shape (768, 768) 2022-03-17 13:52:34,212.212 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.LayerNorm.bias loaded from bert.decoder.layer.0.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:52:34,212.212 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.LayerNorm.weight loaded from bert.decoder.layer.0.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:52:34,212.212 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.dense.bias loaded from bert.decoder.layer.0.attention.output.dense.bias of shape (768,) 2022-03-17 13:52:34,212.212 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.dense.weight loaded from bert.decoder.layer.0.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:52:34,212.212 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.key.bias loaded from bert.decoder.layer.0.attention.self.key.bias of shape (768,) 2022-03-17 13:52:34,212.212 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.key.weight loaded from bert.decoder.layer.0.attention.self.key.weight of shape (768, 768) 2022-03-17 13:52:34,212.212 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.query.bias loaded from bert.decoder.layer.0.attention.self.query.bias of shape (768,) 2022-03-17 13:52:34,212.212 
2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.query.weight loaded from bert.decoder.layer.0.attention.self.query.weight of shape (768, 768) 2022-03-17 13:52:34,212.212 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.value.bias loaded from bert.decoder.layer.0.attention.self.value.bias of shape (768,) 2022-03-17 13:52:34,213.213 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.value.weight loaded from bert.decoder.layer.0.attention.self.value.weight of shape (768, 768) 2022-03-17 13:52:34,213.213 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.intermediate.dense.bias loaded from bert.decoder.layer.0.intermediate.dense.bias of shape (3072,) 2022-03-17 13:52:34,213.213 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.intermediate.dense.weight loaded from bert.decoder.layer.0.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:52:34,213.213 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.LayerNorm.bias loaded from bert.decoder.layer.0.output.LayerNorm.bias of shape (768,) 2022-03-17 13:52:34,213.213 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.LayerNorm.weight loaded from bert.decoder.layer.0.output.LayerNorm.weight of shape (768,) 2022-03-17 13:52:34,213.213 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.dense.bias loaded from bert.decoder.layer.0.output.dense.bias of shape (768,) 2022-03-17 13:52:34,213.213 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.dense.weight loaded from bert.decoder.layer.0.output.dense.weight of shape (768, 3072) 2022-03-17 13:52:34,213.213 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.LayerNorm.bias loaded from bert.decoder.layer.1.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:52:34,213.213 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.LayerNorm.weight loaded from bert.decoder.layer.1.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:52:34,213.213 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.dense.bias loaded from bert.decoder.layer.1.attention.output.dense.bias of shape (768,) 2022-03-17 13:52:34,213.213 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.dense.weight loaded from bert.decoder.layer.1.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:52:34,213.213 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.key.bias loaded from bert.decoder.layer.1.attention.self.key.bias of shape (768,) 2022-03-17 13:52:34,213.213 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.key.weight loaded from bert.decoder.layer.1.attention.self.key.weight of shape (768, 768) 2022-03-17 13:52:34,213.213 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.query.bias loaded from bert.decoder.layer.1.attention.self.query.bias of shape (768,) 2022-03-17 13:52:34,214.214 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.query.weight loaded from 
bert.decoder.layer.1.attention.self.query.weight of shape (768, 768) 2022-03-17 13:52:34,214.214 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.value.bias loaded from bert.decoder.layer.1.attention.self.value.bias of shape (768,) 2022-03-17 13:52:34,214.214 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.value.weight loaded from bert.decoder.layer.1.attention.self.value.weight of shape (768, 768) 2022-03-17 13:52:34,214.214 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.intermediate.dense.bias loaded from bert.decoder.layer.1.intermediate.dense.bias of shape (3072,) 2022-03-17 13:52:34,214.214 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.intermediate.dense.weight loaded from bert.decoder.layer.1.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:52:34,214.214 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.LayerNorm.bias loaded from bert.decoder.layer.1.output.LayerNorm.bias of shape (768,) 2022-03-17 13:52:34,214.214 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.LayerNorm.weight loaded from bert.decoder.layer.1.output.LayerNorm.weight of shape (768,) 2022-03-17 13:52:34,214.214 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.dense.bias loaded from bert.decoder.layer.1.output.dense.bias of shape (768,) 2022-03-17 13:52:34,214.214 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.dense.weight loaded from bert.decoder.layer.1.output.dense.weight of shape (768, 3072) 2022-03-17 13:52:34,214.214 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.LayerNorm.bias loaded from bert.decoder.layer.2.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:52:34,214.214 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.LayerNorm.weight loaded from bert.decoder.layer.2.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:52:34,214.214 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.dense.bias loaded from bert.decoder.layer.2.attention.output.dense.bias of shape (768,) 2022-03-17 13:52:34,214.214 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.dense.weight loaded from bert.decoder.layer.2.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:52:34,214.214 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.key.bias loaded from bert.decoder.layer.2.attention.self.key.bias of shape (768,) 2022-03-17 13:52:34,215.215 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.key.weight loaded from bert.decoder.layer.2.attention.self.key.weight of shape (768, 768) 2022-03-17 13:52:34,215.215 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.query.bias loaded from bert.decoder.layer.2.attention.self.query.bias of shape (768,) 2022-03-17 13:52:34,215.215 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.query.weight loaded from bert.decoder.layer.2.attention.self.query.weight of shape (768, 768) 2022-03-17 13:52:34,215.215 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.decoder.layer.2.attention.self.value.bias loaded from bert.decoder.layer.2.attention.self.value.bias of shape (768,) 2022-03-17 13:52:34,215.215 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.value.weight loaded from bert.decoder.layer.2.attention.self.value.weight of shape (768, 768) 2022-03-17 13:52:34,215.215 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.intermediate.dense.bias loaded from bert.decoder.layer.2.intermediate.dense.bias of shape (3072,) 2022-03-17 13:52:34,215.215 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.intermediate.dense.weight loaded from bert.decoder.layer.2.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:52:34,215.215 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.LayerNorm.bias loaded from bert.decoder.layer.2.output.LayerNorm.bias of shape (768,) 2022-03-17 13:52:34,215.215 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.LayerNorm.weight loaded from bert.decoder.layer.2.output.LayerNorm.weight of shape (768,) 2022-03-17 13:52:34,215.215 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.dense.bias loaded from bert.decoder.layer.2.output.dense.bias of shape (768,) 2022-03-17 13:52:34,215.215 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.dense.weight loaded from bert.decoder.layer.2.output.dense.weight of shape (768, 3072) 2022-03-17 13:52:34,215.215 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.LayerNorm.bias loaded from bert.decoder.layer.3.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:52:34,215.215 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.LayerNorm.weight loaded from bert.decoder.layer.3.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:52:34,216.216 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.dense.bias loaded from bert.decoder.layer.3.attention.output.dense.bias of shape (768,) 2022-03-17 13:52:34,216.216 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.dense.weight loaded from bert.decoder.layer.3.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:52:34,216.216 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.key.bias loaded from bert.decoder.layer.3.attention.self.key.bias of shape (768,) 2022-03-17 13:52:34,216.216 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.key.weight loaded from bert.decoder.layer.3.attention.self.key.weight of shape (768, 768) 2022-03-17 13:52:34,216.216 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.query.bias loaded from bert.decoder.layer.3.attention.self.query.bias of shape (768,) 2022-03-17 13:52:34,216.216 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.query.weight loaded from bert.decoder.layer.3.attention.self.query.weight of shape (768, 768) 2022-03-17 13:52:34,216.216 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.value.bias loaded from bert.decoder.layer.3.attention.self.value.bias of shape (768,) 2022-03-17 
13:52:34,216.216 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.value.weight loaded from bert.decoder.layer.3.attention.self.value.weight of shape (768, 768) 2022-03-17 13:52:34,216.216 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.intermediate.dense.bias loaded from bert.decoder.layer.3.intermediate.dense.bias of shape (3072,) 2022-03-17 13:52:34,216.216 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.intermediate.dense.weight loaded from bert.decoder.layer.3.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:52:34,216.216 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.LayerNorm.bias loaded from bert.decoder.layer.3.output.LayerNorm.bias of shape (768,) 2022-03-17 13:52:34,216.216 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.LayerNorm.weight loaded from bert.decoder.layer.3.output.LayerNorm.weight of shape (768,) 2022-03-17 13:52:34,216.216 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.dense.bias loaded from bert.decoder.layer.3.output.dense.bias of shape (768,) 2022-03-17 13:52:34,217.217 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.dense.weight loaded from bert.decoder.layer.3.output.dense.weight of shape (768, 3072) 2022-03-17 13:52:34,217.217 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.LayerNorm.bias loaded from bert.embeddings.LayerNorm.bias of shape (768,) 2022-03-17 13:52:34,217.217 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.LayerNorm.weight loaded from bert.embeddings.LayerNorm.weight of shape (768,) 2022-03-17 13:52:34,217.217 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.position_embeddings.weight loaded from bert.embeddings.position_embeddings.weight of shape (512, 768) 2022-03-17 13:52:34,217.217 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.token_type_embeddings.weight loaded from bert.embeddings.token_type_embeddings.weight of shape (2, 768) 2022-03-17 13:52:34,217.217 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.word_embeddings.weight loaded from bert.embeddings.word_embeddings.weight of shape (30522, 768) 2022-03-17 13:52:34,217.217 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.proj.bias loaded from bert.encoder.blocks.0.attn.proj.bias of shape (768,) 2022-03-17 13:52:34,217.217 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.proj.weight loaded from bert.encoder.blocks.0.attn.proj.weight of shape (768, 768) 2022-03-17 13:52:34,217.217 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.qkv.bias loaded from bert.encoder.blocks.0.attn.qkv.bias of shape (2304,) 2022-03-17 13:52:34,217.217 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.qkv.weight loaded from bert.encoder.blocks.0.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:52:34,217.217 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc1.bias loaded from bert.encoder.blocks.0.mlp.fc1.bias of shape (3072,) 2022-03-17 13:52:34,217.217 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc1.weight loaded from 
bert.encoder.blocks.0.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:52:34,217.217 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc2.bias loaded from bert.encoder.blocks.0.mlp.fc2.bias of shape (768,) 2022-03-17 13:52:34,217.217 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc2.weight loaded from bert.encoder.blocks.0.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:52:34,218.218 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm1.bias loaded from bert.encoder.blocks.0.norm1.bias of shape (768,) 2022-03-17 13:52:34,218.218 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm1.weight loaded from bert.encoder.blocks.0.norm1.weight of shape (768,) 2022-03-17 13:52:34,218.218 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm2.bias loaded from bert.encoder.blocks.0.norm2.bias of shape (768,) 2022-03-17 13:52:34,218.218 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm2.weight loaded from bert.encoder.blocks.0.norm2.weight of shape (768,) 2022-03-17 13:52:34,218.218 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.proj.bias loaded from bert.encoder.blocks.1.attn.proj.bias of shape (768,) 2022-03-17 13:52:34,218.218 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.proj.weight loaded from bert.encoder.blocks.1.attn.proj.weight of shape (768, 768) 2022-03-17 13:52:34,218.218 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.qkv.bias loaded from bert.encoder.blocks.1.attn.qkv.bias of shape (2304,) 2022-03-17 13:52:34,218.218 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.qkv.weight loaded from bert.encoder.blocks.1.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:52:34,218.218 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc1.bias loaded from bert.encoder.blocks.1.mlp.fc1.bias of shape (3072,) 2022-03-17 13:52:34,218.218 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc1.weight loaded from bert.encoder.blocks.1.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:52:34,218.218 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc2.bias loaded from bert.encoder.blocks.1.mlp.fc2.bias of shape (768,) 2022-03-17 13:52:34,218.218 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc2.weight loaded from bert.encoder.blocks.1.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:52:34,218.218 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm1.bias loaded from bert.encoder.blocks.1.norm1.bias of shape (768,) 2022-03-17 13:52:34,219.219 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm1.weight loaded from bert.encoder.blocks.1.norm1.weight of shape (768,) 2022-03-17 13:52:34,219.219 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm2.bias loaded from bert.encoder.blocks.1.norm2.bias of shape (768,) 2022-03-17 13:52:34,219.219 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm2.weight loaded from bert.encoder.blocks.1.norm2.weight of shape (768,) 2022-03-17 13:52:34,219.219 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.encoder.blocks.10.attn.proj.bias loaded from bert.encoder.blocks.10.attn.proj.bias of shape (768,) 2022-03-17 13:52:34,219.219 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.proj.weight loaded from bert.encoder.blocks.10.attn.proj.weight of shape (768, 768) 2022-03-17 13:52:34,219.219 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.qkv.bias loaded from bert.encoder.blocks.10.attn.qkv.bias of shape (2304,) 2022-03-17 13:52:34,219.219 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.qkv.weight loaded from bert.encoder.blocks.10.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:52:34,219.219 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc1.bias loaded from bert.encoder.blocks.10.mlp.fc1.bias of shape (3072,) 2022-03-17 13:52:34,219.219 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc1.weight loaded from bert.encoder.blocks.10.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:52:34,219.219 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc2.bias loaded from bert.encoder.blocks.10.mlp.fc2.bias of shape (768,) 2022-03-17 13:52:34,219.219 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc2.weight loaded from bert.encoder.blocks.10.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:52:34,219.219 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm1.bias loaded from bert.encoder.blocks.10.norm1.bias of shape (768,) 2022-03-17 13:52:34,219.219 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm1.weight loaded from bert.encoder.blocks.10.norm1.weight of shape (768,) 2022-03-17 13:52:34,219.219 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm2.bias loaded from bert.encoder.blocks.10.norm2.bias of shape (768,) 2022-03-17 13:52:34,220.220 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm2.weight loaded from bert.encoder.blocks.10.norm2.weight of shape (768,) 2022-03-17 13:52:34,220.220 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.proj.bias loaded from bert.encoder.blocks.11.attn.proj.bias of shape (768,) 2022-03-17 13:52:34,220.220 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.proj.weight loaded from bert.encoder.blocks.11.attn.proj.weight of shape (768, 768) 2022-03-17 13:52:34,220.220 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.qkv.bias loaded from bert.encoder.blocks.11.attn.qkv.bias of shape (2304,) 2022-03-17 13:52:34,220.220 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.qkv.weight loaded from bert.encoder.blocks.11.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:52:34,220.220 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc1.bias loaded from bert.encoder.blocks.11.mlp.fc1.bias of shape (3072,) 2022-03-17 13:52:34,220.220 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc1.weight loaded from bert.encoder.blocks.11.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:52:34,220.220 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc2.bias loaded from 
bert.encoder.blocks.11.mlp.fc2.bias of shape (768,) 2022-03-17 13:52:34,220.220 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc2.weight loaded from bert.encoder.blocks.11.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:52:34,220.220 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm1.bias loaded from bert.encoder.blocks.11.norm1.bias of shape (768,) 2022-03-17 13:52:34,220.220 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm1.weight loaded from bert.encoder.blocks.11.norm1.weight of shape (768,) 2022-03-17 13:52:34,220.220 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm2.bias loaded from bert.encoder.blocks.11.norm2.bias of shape (768,) 2022-03-17 13:52:34,221.221 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm2.weight loaded from bert.encoder.blocks.11.norm2.weight of shape (768,) 2022-03-17 13:52:34,221.221 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.proj.bias loaded from bert.encoder.blocks.2.attn.proj.bias of shape (768,) 2022-03-17 13:52:34,221.221 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.proj.weight loaded from bert.encoder.blocks.2.attn.proj.weight of shape (768, 768) 2022-03-17 13:52:34,221.221 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.qkv.bias loaded from bert.encoder.blocks.2.attn.qkv.bias of shape (2304,) 2022-03-17 13:52:34,221.221 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.qkv.weight loaded from bert.encoder.blocks.2.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:52:34,221.221 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc1.bias loaded from bert.encoder.blocks.2.mlp.fc1.bias of shape (3072,) 2022-03-17 13:52:34,221.221 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc1.weight loaded from bert.encoder.blocks.2.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:52:34,221.221 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc2.bias loaded from bert.encoder.blocks.2.mlp.fc2.bias of shape (768,) 2022-03-17 13:52:34,221.221 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc2.weight loaded from bert.encoder.blocks.2.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:52:34,221.221 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm1.bias loaded from bert.encoder.blocks.2.norm1.bias of shape (768,) 2022-03-17 13:52:34,221.221 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm1.weight loaded from bert.encoder.blocks.2.norm1.weight of shape (768,) 2022-03-17 13:52:34,221.221 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm2.bias loaded from bert.encoder.blocks.2.norm2.bias of shape (768,) 2022-03-17 13:52:34,221.221 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm2.weight loaded from bert.encoder.blocks.2.norm2.weight of shape (768,) 2022-03-17 13:52:34,222.222 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.proj.bias loaded from bert.encoder.blocks.3.attn.proj.bias of shape (768,) 2022-03-17 13:52:34,222.222 2829:checkpoint.py:99 
align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.proj.weight loaded from bert.encoder.blocks.3.attn.proj.weight of shape (768, 768) 2022-03-17 13:52:34,222.222 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.qkv.bias loaded from bert.encoder.blocks.3.attn.qkv.bias of shape (2304,) 2022-03-17 13:52:34,222.222 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.qkv.weight loaded from bert.encoder.blocks.3.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:52:34,222.222 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc1.bias loaded from bert.encoder.blocks.3.mlp.fc1.bias of shape (3072,) 2022-03-17 13:52:34,222.222 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc1.weight loaded from bert.encoder.blocks.3.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:52:34,222.222 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc2.bias loaded from bert.encoder.blocks.3.mlp.fc2.bias of shape (768,) 2022-03-17 13:52:34,222.222 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc2.weight loaded from bert.encoder.blocks.3.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:52:34,222.222 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm1.bias loaded from bert.encoder.blocks.3.norm1.bias of shape (768,) 2022-03-17 13:52:34,222.222 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm1.weight loaded from bert.encoder.blocks.3.norm1.weight of shape (768,) 2022-03-17 13:52:34,222.222 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm2.bias loaded from bert.encoder.blocks.3.norm2.bias of shape (768,) 2022-03-17 13:52:34,222.222 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm2.weight loaded from bert.encoder.blocks.3.norm2.weight of shape (768,) 2022-03-17 13:52:34,222.222 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.attn.proj.bias loaded from bert.encoder.blocks.4.attn.proj.bias of shape (768,) 2022-03-17 13:52:34,223.223 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.attn.proj.weight loaded from bert.encoder.blocks.4.attn.proj.weight of shape (768, 768) 2022-03-17 13:52:34,223.223 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.attn.qkv.bias loaded from bert.encoder.blocks.4.attn.qkv.bias of shape (2304,) 2022-03-17 13:52:34,223.223 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.attn.qkv.weight loaded from bert.encoder.blocks.4.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:52:34,223.223 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.mlp.fc1.bias loaded from bert.encoder.blocks.4.mlp.fc1.bias of shape (3072,) 2022-03-17 13:52:34,223.223 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.mlp.fc1.weight loaded from bert.encoder.blocks.4.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:52:34,223.223 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.mlp.fc2.bias loaded from bert.encoder.blocks.4.mlp.fc2.bias of shape (768,) 2022-03-17 13:52:34,223.223 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.mlp.fc2.weight loaded from 
bert.encoder.blocks.4.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:52:34,223.223 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.norm1.bias loaded from bert.encoder.blocks.4.norm1.bias of shape (768,) 2022-03-17 13:52:34,223.223 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.norm1.weight loaded from bert.encoder.blocks.4.norm1.weight of shape (768,) 2022-03-17 13:52:34,223.223 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.norm2.bias loaded from bert.encoder.blocks.4.norm2.bias of shape (768,) 2022-03-17 13:52:34,223.223 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.norm2.weight loaded from bert.encoder.blocks.4.norm2.weight of shape (768,) 2022-03-17 13:52:34,223.223 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.attn.proj.bias loaded from bert.encoder.blocks.5.attn.proj.bias of shape (768,) 2022-03-17 13:52:34,223.223 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.attn.proj.weight loaded from bert.encoder.blocks.5.attn.proj.weight of shape (768, 768) 2022-03-17 13:52:34,224.224 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.attn.qkv.bias loaded from bert.encoder.blocks.5.attn.qkv.bias of shape (2304,) 2022-03-17 13:52:34,224.224 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.attn.qkv.weight loaded from bert.encoder.blocks.5.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:52:34,224.224 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.mlp.fc1.bias loaded from bert.encoder.blocks.5.mlp.fc1.bias of shape (3072,) 2022-03-17 13:52:34,224.224 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.mlp.fc1.weight loaded from bert.encoder.blocks.5.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:52:34,224.224 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.mlp.fc2.bias loaded from bert.encoder.blocks.5.mlp.fc2.bias of shape (768,) 2022-03-17 13:52:34,224.224 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.mlp.fc2.weight loaded from bert.encoder.blocks.5.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:52:34,224.224 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm1.bias loaded from bert.encoder.blocks.5.norm1.bias of shape (768,) 2022-03-17 13:52:34,224.224 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm1.weight loaded from bert.encoder.blocks.5.norm1.weight of shape (768,) 2022-03-17 13:52:34,224.224 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm2.bias loaded from bert.encoder.blocks.5.norm2.bias of shape (768,) 2022-03-17 13:52:34,224.224 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm2.weight loaded from bert.encoder.blocks.5.norm2.weight of shape (768,) 2022-03-17 13:52:34,224.224 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.attn.proj.bias loaded from bert.encoder.blocks.6.attn.proj.bias of shape (768,) 2022-03-17 13:52:34,224.224 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.attn.proj.weight loaded from bert.encoder.blocks.6.attn.proj.weight of shape (768, 768) 2022-03-17 13:52:34,224.224 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.encoder.blocks.6.attn.qkv.bias loaded from bert.encoder.blocks.6.attn.qkv.bias of shape (2304,) 2022-03-17 13:52:34,225.225 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.attn.qkv.weight loaded from bert.encoder.blocks.6.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:52:34,225.225 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc1.bias loaded from bert.encoder.blocks.6.mlp.fc1.bias of shape (3072,) 2022-03-17 13:52:34,225.225 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc1.weight loaded from bert.encoder.blocks.6.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:52:34,225.225 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc2.bias loaded from bert.encoder.blocks.6.mlp.fc2.bias of shape (768,) 2022-03-17 13:52:34,225.225 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc2.weight loaded from bert.encoder.blocks.6.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:52:34,225.225 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm1.bias loaded from bert.encoder.blocks.6.norm1.bias of shape (768,) 2022-03-17 13:52:34,225.225 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm1.weight loaded from bert.encoder.blocks.6.norm1.weight of shape (768,) 2022-03-17 13:52:34,225.225 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm2.bias loaded from bert.encoder.blocks.6.norm2.bias of shape (768,) 2022-03-17 13:52:34,225.225 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm2.weight loaded from bert.encoder.blocks.6.norm2.weight of shape (768,) 2022-03-17 13:52:34,225.225 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.proj.bias loaded from bert.encoder.blocks.7.attn.proj.bias of shape (768,) 2022-03-17 13:52:34,225.225 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.proj.weight loaded from bert.encoder.blocks.7.attn.proj.weight of shape (768, 768) 2022-03-17 13:52:34,225.225 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.qkv.bias loaded from bert.encoder.blocks.7.attn.qkv.bias of shape (2304,) 2022-03-17 13:52:34,225.225 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.qkv.weight loaded from bert.encoder.blocks.7.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:52:34,225.225 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc1.bias loaded from bert.encoder.blocks.7.mlp.fc1.bias of shape (3072,) 2022-03-17 13:52:34,226.226 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc1.weight loaded from bert.encoder.blocks.7.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:52:34,226.226 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc2.bias loaded from bert.encoder.blocks.7.mlp.fc2.bias of shape (768,) 2022-03-17 13:52:34,226.226 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc2.weight loaded from bert.encoder.blocks.7.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:52:34,226.226 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm1.bias loaded from bert.encoder.blocks.7.norm1.bias of shape (768,) 2022-03-17 
13:52:34,226.226 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm1.weight loaded from bert.encoder.blocks.7.norm1.weight of shape (768,) 2022-03-17 13:52:34,226.226 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm2.bias loaded from bert.encoder.blocks.7.norm2.bias of shape (768,) 2022-03-17 13:52:34,226.226 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm2.weight loaded from bert.encoder.blocks.7.norm2.weight of shape (768,) 2022-03-17 13:52:34,226.226 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.proj.bias loaded from bert.encoder.blocks.8.attn.proj.bias of shape (768,) 2022-03-17 13:52:34,226.226 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.proj.weight loaded from bert.encoder.blocks.8.attn.proj.weight of shape (768, 768) 2022-03-17 13:52:34,226.226 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.qkv.bias loaded from bert.encoder.blocks.8.attn.qkv.bias of shape (2304,) 2022-03-17 13:52:34,226.226 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.qkv.weight loaded from bert.encoder.blocks.8.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:52:34,226.226 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc1.bias loaded from bert.encoder.blocks.8.mlp.fc1.bias of shape (3072,) 2022-03-17 13:52:34,226.226 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc1.weight loaded from bert.encoder.blocks.8.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:52:34,226.226 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc2.bias loaded from bert.encoder.blocks.8.mlp.fc2.bias of shape (768,) 2022-03-17 13:52:34,227.227 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc2.weight loaded from bert.encoder.blocks.8.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:52:34,227.227 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm1.bias loaded from bert.encoder.blocks.8.norm1.bias of shape (768,) 2022-03-17 13:52:34,227.227 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm1.weight loaded from bert.encoder.blocks.8.norm1.weight of shape (768,) 2022-03-17 13:52:34,227.227 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm2.bias loaded from bert.encoder.blocks.8.norm2.bias of shape (768,) 2022-03-17 13:52:34,227.227 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm2.weight loaded from bert.encoder.blocks.8.norm2.weight of shape (768,) 2022-03-17 13:52:34,227.227 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.proj.bias loaded from bert.encoder.blocks.9.attn.proj.bias of shape (768,) 2022-03-17 13:52:34,227.227 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.proj.weight loaded from bert.encoder.blocks.9.attn.proj.weight of shape (768, 768) 2022-03-17 13:52:34,227.227 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.qkv.bias loaded from bert.encoder.blocks.9.attn.qkv.bias of shape (2304,) 2022-03-17 13:52:34,227.227 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.qkv.weight loaded from 
bert.encoder.blocks.9.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:52:34,227.227 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc1.bias loaded from bert.encoder.blocks.9.mlp.fc1.bias of shape (3072,) 2022-03-17 13:52:34,227.227 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc1.weight loaded from bert.encoder.blocks.9.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:52:34,227.227 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc2.bias loaded from bert.encoder.blocks.9.mlp.fc2.bias of shape (768,) 2022-03-17 13:52:34,227.227 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc2.weight loaded from bert.encoder.blocks.9.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:52:34,227.227 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm1.bias loaded from bert.encoder.blocks.9.norm1.bias of shape (768,) 2022-03-17 13:52:34,228.228 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm1.weight loaded from bert.encoder.blocks.9.norm1.weight of shape (768,) 2022-03-17 13:52:34,228.228 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm2.bias loaded from bert.encoder.blocks.9.norm2.bias of shape (768,) 2022-03-17 13:52:34,228.228 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm2.weight loaded from bert.encoder.blocks.9.norm2.weight of shape (768,) 2022-03-17 13:52:34,228.228 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.proj.bias loaded from bert.encoder.tag_blocks.0.attn.proj.bias of shape (768,) 2022-03-17 13:52:34,228.228 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.proj.weight loaded from bert.encoder.tag_blocks.0.attn.proj.weight of shape (768, 768) 2022-03-17 13:52:34,228.228 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.qkv.bias loaded from bert.encoder.tag_blocks.0.attn.qkv.bias of shape (2304,) 2022-03-17 13:52:34,228.228 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.qkv.weight loaded from bert.encoder.tag_blocks.0.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:52:34,228.228 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc1.bias loaded from bert.encoder.tag_blocks.0.mlp.fc1.bias of shape (3072,) 2022-03-17 13:52:34,228.228 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc1.weight loaded from bert.encoder.tag_blocks.0.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:52:34,228.228 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc2.bias loaded from bert.encoder.tag_blocks.0.mlp.fc2.bias of shape (768,) 2022-03-17 13:52:34,228.228 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc2.weight loaded from bert.encoder.tag_blocks.0.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:52:34,228.228 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm1.bias loaded from bert.encoder.tag_blocks.0.norm1.bias of shape (768,) 2022-03-17 13:52:34,228.228 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm1.weight loaded from bert.encoder.tag_blocks.0.norm1.weight of shape 
(768,)
[... 2022-03-17 13:52:34 2829:checkpoint.py:99 align_and_update_state_dicts() logs one "module.<key> loaded from <key> of shape <shape>" confirmation per remaining parameter: bert.encoder.tag_blocks.0-3 (attn.proj, attn.qkv, mlp.fc1/fc2, norm1/norm2), bert.extra_embeddings (LayerNorm, position/token_type/word embeddings), bert.pooler.dense, bert.tag_logit.predictions and cls.predictions, with shapes (768,), (768, 768), (2304,), (2304, 768), (3072,), (3072, 768), (768, 3072), (512, 768), (2, 768) and (30522, 768) as appropriate ...]
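The per-parameter lines above all follow one rule: a key in the (Data)Parallel-wrapped target model is matched to the checkpoint key without the "module." prefix, and the tensor is copied only when the shapes agree. Below is a minimal sketch of that matching; the function and variable names are hypothetical, and the repo's actual align_and_update_state_dicts() in checkpoint.py is more general.

```python
import logging

def align_state_dicts(model_state, ckpt_state):
    # For each target key, try the exact checkpoint key first, then the key
    # with the first "module." prefix removed; copy only on an exact shape
    # match, mirroring the checkpoint.py:99 lines above.
    matched = 0
    for tgt_key, tgt_param in model_state.items():
        for src_key in (tgt_key, tgt_key.replace('module.', '', 1)):
            src = ckpt_state.get(src_key)
            if src is not None and tuple(src.shape) == tuple(tgt_param.shape):
                model_state[tgt_key] = src
                matched += 1
                logging.info('%s loaded from %s of shape %s',
                             tgt_key, src_key, tuple(src.shape))
                break
    logging.info('target model param = %d; name matched = %d; loaded = %d',
                 len(model_state), matched, matched)
    return model_state
```

Trying the exact key first covers entries such as image_encoder.module.cls_token, which the log shows loading from the identically named checkpoint key, while the stripped form covers the module.bert.* entries loading from bert.*.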
2022-03-17 13:52:34,233.233 2829:checkpoint.py:104 align_and_update_state_dicts(): target model param = 288; name matched = 288; loaded = 288
2022-03-17 13:52:34,233.233 2829:checkpoint.py:107 align_and_update_state_dicts(): from loaded; ignore = []
2022-03-17 13:52:34,236.236 2829:torch_common.py:1069 load_model_state_ignore_mismatch(): unique keys in init dict = ['module.cls.predictions.decoder.weight']; total = 1
2022-03-17 13:52:34,404.404 2829:torch_common.py:1074 load_model_state_ignore_mismatch(): unique key (not initialized) in current model = ['module.cls.predictions.decoder.weight']
2022-03-17 13:52:34,732.732 2829:qd_common.py:3452 print_frame_info(): func name = __init__; self = ; data = TaxCocoCaption; split = test; add_key = False; backend = cv; hold_buffer = 0; save_original = False
2022-03-17 13:52:34,757.757 2829:uni_pipeline.py:509 get_data_loader(): sampler =
2022-03-17 13:52:34,758.758 2829:uni_pipeline.py:1000 predict(): writing output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0040000.pt.TaxCocoCaption.test.crop384.predict.tsv_0_32.tsv
2022-03-17 13:52:34,830.830 2829:uni_pipeline.py:907 predict_iter(): DatasetPlusTransform(dataset=, transform=Compose(
    Compose(
        ImageTransform2Dict(image_transform=Compose(
            ToPILImage()
            Resize(size=384, interpolation=PIL.Image.BICUBIC)
            CenterCrop(size=(384, 384))
            ToTensor()
            Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
        ))
    )
    LoadLabel(data=TaxCocoCaption, split=test, version=vinvl)
    TransCaptionTensorizer(tensorizer=, pad_to_max=True, pad_image_to_max=True)
))
uni_pipeline.py:908:   0%|          | 0/10 [00:00<?, ?it/s]
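The predict_iter() dump above pins down the evaluation-time preprocessing: bicubic resize to 384, center crop to 384x384, and normalization with mean and std of 0.5. A standalone torchvision sketch of the image-side transform follows; ImageTransform2Dict, LoadLabel and TransCaptionTensorizer are repo-specific wrappers and are deliberately omitted.

```python
from PIL import Image
from torchvision import transforms

# Image transform as logged: cv-backend arrays in, normalized 384x384
# tensors in [-1, 1] out. Newer torchvision versions prefer
# transforms.InterpolationMode.BICUBIC over the PIL constant used here.
eval_image_transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.Resize(384, interpolation=Image.BICUBIC),
    transforms.CenterCrop((384, 384)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),
])
```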
"/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__ self.encoder = TIMMVitSplitEncoder(config) File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__ logging.info(config) Unable to print the message and arguments - possible formatting error. Use the traceback above to help find the error. --- Logging error --- Traceback (most recent call last): File "/opt/conda/lib/python3.7/logging/__init__.py", line 1025, in emit msg = self.format(record) File "/opt/conda/lib/python3.7/logging/__init__.py", line 869, in format return fmt.format(record) File "/opt/conda/lib/python3.7/logging/__init__.py", line 608, in format record.message = record.getMessage() File "/opt/conda/lib/python3.7/logging/__init__.py", line 367, in getMessage msg = str(self.msg) File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 236, in __repr__ return str(self.to_json_string()) File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 245, in to_json_string return json.dumps(self.to_dict(), indent=2, sort_keys=True) + "\n" File "/opt/conda/lib/python3.7/json/__init__.py", line 238, in dumps **kw).encode(obj) File "/opt/conda/lib/python3.7/json/encoder.py", line 201, in encode chunks = list(chunks) File "/opt/conda/lib/python3.7/json/encoder.py", line 431, in _iterencode yield from _iterencode_dict(o, _current_indent_level) File "/opt/conda/lib/python3.7/json/encoder.py", line 405, in _iterencode_dict yield from chunks File "/opt/conda/lib/python3.7/json/encoder.py", line 438, in _iterencode o = _default(o) File "/opt/conda/lib/python3.7/json/encoder.py", line 179, in default raise TypeError(f'Object of type {o.__class__.__name__} ' TypeError: Object of type BertTokenizer is not JSON serializable Call stack: File "src/qd/pipeline.py", line 1368, in locals()[function_name](**kwargs) File "src/qd/pipeline.py", line 637, in pipeline_train_eval_multi pip.monitor_train() File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1224, in monitor_train need_wait_models = self.pred_eval_intermediate_models() File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1286, in pred_eval_intermediate_models pred = self.ensure_predict(model_file=model_file) File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 886, in ensure_predict self.predict(model_file, predict_result_file) File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 989, in predict model = self.get_model(is_train=False) File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model model = self.get_raw_model(is_train) File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model model = TaggerEncDecSplitForImageCaptioning(config=config) # init from scratch File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__ self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config) File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__ self.encoder = TIMMVitSplitEncoder(config) File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__ logging.info(config) Unable to print the message and arguments - possible formatting error. Use the traceback above to help find the error. 
2022-03-17 13:54:47,895.895 2829:modeling_bert.py:529 __init__(): TIMM Split image encoder load from pre-trained: True
2022-03-17 13:54:48,963.963 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-17 13:54:50,469.469 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-17 13:54:51,106.106 2829:modeling_bert.py:2677 __init__(): BertImgModel Image Dimension: 2054
2022-03-17 13:54:51,254.254 2829:modeling_bert.py:2688 __init__(): BertImgModel Image Dimension: 2054
2022-03-17 13:54:52,024.024 2829:tagger_caption_uni_pipeline_expanding.py:1130 get_image_encoder_model(): VIT image encoder loaded from pre-trained weight! Note that this might be replaced by pre-trained checkpoint later!
2022-03-17 13:54:52,024.024 2829:tagger_caption_uni_pipeline_expanding.py:1134 get_image_encoder_model(): Non-Patch Selection Mode.
2022-03-17 13:54:53,137.137 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-17 13:54:53,734.734 2829:checkpoint.py:240 load(): Loading checkpoint from output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0045000.pt
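All three load_pretrained() lines fetch the same timm checkpoint (jx_vit_base_p16_384-83fb41ba.pth), i.e. ViT-B/16 pretrained at 384x384 resolution. As a standalone reference, and assuming any reasonably recent timm release, the equivalent call is:

```python
import timm

# Downloads and loads the same jx_vit_base_p16_384 weights referenced
# in the log lines above.
vit = timm.create_model('vit_base_patch16_384', pretrained=True)
```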
[... 2022-03-17 13:55:01 2829:checkpoint.py:99 align_and_update_state_dicts(): per-parameter "loaded from ... of shape ..." confirmations for model_iter_0045000.pt (image_encoder.module.cls_token/head/patch_embed/pos_embed, module.bert.caption_pooler, decoder.layer.0-3, embeddings, encoder.blocks.0-11, encoder.tag_blocks.0-3, extra_embeddings, pooler, tag_logit), identical in keys and shapes to the model_iter_0040000.pt load above ...]
2022-03-17 13:55:01,842.842 2829:checkpoint.py:99
align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.LayerNorm.bias loaded from bert.tag_logit.predictions.transform.LayerNorm.bias of shape (768,)
2022-03-17 13:55:01,842.842 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.LayerNorm.weight loaded from bert.tag_logit.predictions.transform.LayerNorm.weight of shape (768,)
2022-03-17 13:55:01,842.842 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.dense.bias loaded from bert.tag_logit.predictions.transform.dense.bias of shape (768,)
2022-03-17 13:55:01,842.842 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.dense.weight loaded from bert.tag_logit.predictions.transform.dense.weight of shape (768, 768)
2022-03-17 13:55:01,842.842 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.bias loaded from cls.predictions.bias of shape (30522,)
2022-03-17 13:55:01,842.842 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.decoder.weight loaded from cls.predictions.decoder.weight of shape (30522, 768)
2022-03-17 13:55:01,842.842 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.LayerNorm.bias loaded from cls.predictions.transform.LayerNorm.bias of shape (768,)
2022-03-17 13:55:01,842.842 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.LayerNorm.weight loaded from cls.predictions.transform.LayerNorm.weight of shape (768,)
2022-03-17 13:55:01,842.842 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.dense.bias loaded from cls.predictions.transform.dense.bias of shape (768,)
2022-03-17 13:55:01,842.842 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.dense.weight loaded from cls.predictions.transform.dense.weight of shape (768, 768)
2022-03-17 13:55:01,842.842 2829:checkpoint.py:104 align_and_update_state_dicts(): target model param = 288; name matched = 288; loaded = 288
2022-03-17 13:55:01,843.843 2829:checkpoint.py:107 align_and_update_state_dicts(): from loaded; ignore = []
2022-03-17 13:55:01,846.846 2829:torch_common.py:1069 load_model_state_ignore_mismatch(): unique keys in init dict = ['module.cls.predictions.decoder.weight']; total = 1
2022-03-17 13:55:02,008.008 2829:torch_common.py:1074 load_model_state_ignore_mismatch(): unique key (not initialized) in current model = ['module.cls.predictions.decoder.weight']
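The block above ends with the useful summary: all 288 target parameters were name-matched and loaded, and only module.cls.predictions.decoder.weight is reported as a unique, not-initialized key. Every message follows the pattern "module.X loaded from X", i.e. a DataParallel/DDP-wrapped model (keys prefixed with "module.") being filled from an unwrapped checkpoint. Below is a minimal sketch of that kind of suffix-matching loader; it illustrates the idea only and is not the repository's actual checkpoint.py implementation.

import logging
import torch

def align_by_suffix(model_sd, ckpt_sd):
    """Fill a (possibly 'module.'-prefixed) model state dict from a plain
    checkpoint by matching each model key to the longest checkpoint key
    that is a suffix of it, skipping shape mismatches."""
    aligned = {}
    for key, param in model_sd.items():
        candidates = [k for k in ckpt_sd if key == k or key.endswith('.' + k)]
        if not candidates:
            continue  # no match: parameter stays at its fresh initialization
        src = max(candidates, key=len)
        if ckpt_sd[src].shape != param.shape:
            continue  # e.g. a resized head; keep the model's own tensor
        aligned[key] = ckpt_sd[src]
        logging.info('%s loaded from %s of shape %s',
                     key, src, tuple(ckpt_sd[src].shape))
    return aligned

# usage sketch:
# model.load_state_dict(align_by_suffix(model.state_dict(),
#                                       torch.load('model.pt')), strict=False)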
2022-03-17 13:55:02,331.331 2829:qd_common.py:3452 print_frame_info(): func name = __init__; self = ; data = TaxCocoCaption; split = test; add_key = False; backend = cv; hold_buffer = 0; save_original = False
2022-03-17 13:55:02,357.357 2829:uni_pipeline.py:509 get_data_loader(): sampler =
2022-03-17 13:55:02,358.358 2829:uni_pipeline.py:1000 predict(): writing output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0045000.pt.TaxCocoCaption.test.crop384.predict.tsv_0_32.tsv
2022-03-17 13:55:02,431.431 2829:uni_pipeline.py:907 predict_iter(): DatasetPlusTransform(dataset=, transform=Compose(
    Compose(
        ImageTransform2Dict(image_transform=Compose(
            ToPILImage()
            Resize(size=384, interpolation=PIL.Image.BICUBIC)
            CenterCrop(size=(384, 384))
            ToTensor()
            Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
        ))
    )
    LoadLabel(data=TaxCocoCaption, split=test, version=vinvl)
    TransCaptionTensorizer(tensorizer=, pad_to_max=True, pad_image_to_max=True)
))
uni_pipeline.py:908: 0%| | 0/10 [00:00<?, ?it/s]
--- Logging error ---
Call stack:
  File "src/qd/pipeline.py", line 1368, in <module>
    locals()[function_name](**kwargs)
  File "src/qd/pipeline.py", line 637, in pipeline_train_eval_multi
    pip.monitor_train()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1224, in monitor_train
    need_wait_models = self.pred_eval_intermediate_models()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1286, in pred_eval_intermediate_models
    pred = self.ensure_predict(model_file=model_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 886, in ensure_predict
    self.predict(model_file, predict_result_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 989, in predict
    model = self.get_model(is_train=False)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model
    model = self.get_raw_model(is_train)
  File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model
    model = TaggerEncDecSplitForImageCaptioning(config=config)  # init from scratch
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__
    self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__
    self.encoder = TIMMVitSplitEncoder(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__
    logging.info(config)
Unable to print the message and arguments - possible formatting error.
Use the traceback above to help find the error.
--- Logging error ---
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 1025, in emit
    msg = self.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 869, in format
    return fmt.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 608, in format
    record.message = record.getMessage()
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 367, in getMessage
    msg = str(self.msg)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 236, in __repr__
    return str(self.to_json_string())
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 245, in to_json_string
    return json.dumps(self.to_dict(), indent=2, sort_keys=True) + "\n"
  File "/opt/conda/lib/python3.7/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 201, in encode
    chunks = list(chunks)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 431, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "/opt/conda/lib/python3.7/json/encoder.py", line 438, in _iterencode
    o = _default(o)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type BertTokenizer is not JSON serializable
Call stack:
  File "src/qd/pipeline.py", line 1368, in <module>
    locals()[function_name](**kwargs)
  File "src/qd/pipeline.py", line 637, in pipeline_train_eval_multi
    pip.monitor_train()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1224, in monitor_train
    need_wait_models = self.pred_eval_intermediate_models()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1286, in pred_eval_intermediate_models
    pred = self.ensure_predict(model_file=model_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 886, in ensure_predict
    self.predict(model_file, predict_result_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 989, in predict
    model = self.get_model(is_train=False)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model
    model = self.get_raw_model(is_train)
  File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model
    model = TaggerEncDecSplitForImageCaptioning(config=config)  # init from scratch
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__
    self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__
    self.encoder = TIMMVitSplitEncoder(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__
    logging.info(config)
Unable to print the message and arguments - possible formatting error.
Use the traceback above to help find the error.
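Both logging errors above share one root cause: logging.info(config) invokes the config object's __repr__, which calls to_json_string(), and json.dumps raises on a BertTokenizer instance that was stored on the config. The job keeps running (logging errors are non-fatal), but every attempt to log the config dumps this stack. A hypothetical one-line hardening of to_json_string, assuming one is free to patch modeling_utils.py, is to give json.dumps a fallback encoder:

import json

def to_json_string(self):
    # default=repr turns anything json cannot encode (here, the BertTokenizer
    # stored on the config) into its repr string instead of raising TypeError.
    return json.dumps(self.to_dict(), indent=2, sort_keys=True,
                      default=repr) + "\n"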
File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 989, in predict model = self.get_model(is_train=False) File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model model = self.get_raw_model(is_train) File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model model = TaggerEncDecSplitForImageCaptioning(config=config) # init from scratch File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__ self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config) File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__ self.encoder = TIMMVitSplitEncoder(config) File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__ logging.info(config) Unable to print the message and arguments - possible formatting error. Use the traceback above to help find the error. 2022-03-17 13:57:15,241.241 2829:modeling_bert.py:529 __init__(): TIMM Split image encoder load from pre-trained: True 2022-03-17 13:57:16,308.308 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py 2022-03-17 13:57:17,851.851 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py 2022-03-17 13:57:18,503.503 2829:modeling_bert.py:2677 __init__(): BertImgModel Image Dimension: 2054 2022-03-17 13:57:18,651.651 2829:modeling_bert.py:2688 __init__(): BertImgModel Image Dimension: 2054 2022-03-17 13:57:19,440.440 2829:tagger_caption_uni_pipeline_expanding.py:1130 get_image_encoder_model(): VIT image encoder loaded from pre-trained weight! Note that this might be replaced by pre-trained checkpoint later! 2022-03-17 13:57:19,440.440 2829:tagger_caption_uni_pipeline_expanding.py:1134 get_image_encoder_model(): Non-Patch Selection Mode. 
2022-03-17 13:57:20,571.571 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py 2022-03-17 13:57:21,104.104 2829:checkpoint.py:240 load(): Loading checkpoint from output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0050000.pt 2022-03-17 13:57:29,279.279 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.cls_token loaded from image_encoder.module.cls_token of shape (1, 1, 768) 2022-03-17 13:57:29,279.279 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.head.bias loaded from image_encoder.module.head.bias of shape (1000,) 2022-03-17 13:57:29,280.280 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.head.weight loaded from image_encoder.module.head.weight of shape (1000, 768) 2022-03-17 13:57:29,280.280 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.patch_embed.proj.bias loaded from image_encoder.module.patch_embed.proj.bias of shape (768,) 2022-03-17 13:57:29,280.280 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.patch_embed.proj.weight loaded from image_encoder.module.patch_embed.proj.weight of shape (768, 3, 16, 16) 2022-03-17 13:57:29,280.280 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.pos_embed loaded from image_encoder.module.pos_embed of shape (1, 577, 768) 2022-03-17 13:57:29,280.280 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.caption_pooler.dense.bias loaded from bert.caption_pooler.dense.bias of shape (768,) 2022-03-17 13:57:29,280.280 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.caption_pooler.dense.weight loaded from bert.caption_pooler.dense.weight of shape (768, 768) 2022-03-17 13:57:29,280.280 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.LayerNorm.bias loaded from bert.decoder.layer.0.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:57:29,280.280 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.LayerNorm.weight loaded from bert.decoder.layer.0.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:57:29,280.280 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.dense.bias loaded from bert.decoder.layer.0.attention.output.dense.bias of shape (768,) 2022-03-17 13:57:29,280.280 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.dense.weight loaded from bert.decoder.layer.0.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:57:29,280.280 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.key.bias loaded from bert.decoder.layer.0.attention.self.key.bias of shape (768,) 2022-03-17 13:57:29,281.281 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.key.weight loaded from bert.decoder.layer.0.attention.self.key.weight of shape (768, 768) 2022-03-17 13:57:29,281.281 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.query.bias loaded from bert.decoder.layer.0.attention.self.query.bias of shape (768,) 2022-03-17 13:57:29,281.281 
2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.query.weight loaded from bert.decoder.layer.0.attention.self.query.weight of shape (768, 768) 2022-03-17 13:57:29,281.281 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.value.bias loaded from bert.decoder.layer.0.attention.self.value.bias of shape (768,) 2022-03-17 13:57:29,281.281 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.value.weight loaded from bert.decoder.layer.0.attention.self.value.weight of shape (768, 768) 2022-03-17 13:57:29,281.281 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.intermediate.dense.bias loaded from bert.decoder.layer.0.intermediate.dense.bias of shape (3072,) 2022-03-17 13:57:29,281.281 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.intermediate.dense.weight loaded from bert.decoder.layer.0.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:57:29,281.281 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.LayerNorm.bias loaded from bert.decoder.layer.0.output.LayerNorm.bias of shape (768,) 2022-03-17 13:57:29,281.281 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.LayerNorm.weight loaded from bert.decoder.layer.0.output.LayerNorm.weight of shape (768,) 2022-03-17 13:57:29,281.281 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.dense.bias loaded from bert.decoder.layer.0.output.dense.bias of shape (768,) 2022-03-17 13:57:29,281.281 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.dense.weight loaded from bert.decoder.layer.0.output.dense.weight of shape (768, 3072) 2022-03-17 13:57:29,281.281 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.LayerNorm.bias loaded from bert.decoder.layer.1.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:57:29,281.281 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.LayerNorm.weight loaded from bert.decoder.layer.1.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:57:29,281.281 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.dense.bias loaded from bert.decoder.layer.1.attention.output.dense.bias of shape (768,) 2022-03-17 13:57:29,282.282 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.dense.weight loaded from bert.decoder.layer.1.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:57:29,282.282 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.key.bias loaded from bert.decoder.layer.1.attention.self.key.bias of shape (768,) 2022-03-17 13:57:29,282.282 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.key.weight loaded from bert.decoder.layer.1.attention.self.key.weight of shape (768, 768) 2022-03-17 13:57:29,282.282 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.query.bias loaded from bert.decoder.layer.1.attention.self.query.bias of shape (768,) 2022-03-17 13:57:29,282.282 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.query.weight loaded from 
bert.decoder.layer.1.attention.self.query.weight of shape (768, 768) 2022-03-17 13:57:29,282.282 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.value.bias loaded from bert.decoder.layer.1.attention.self.value.bias of shape (768,) 2022-03-17 13:57:29,282.282 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.value.weight loaded from bert.decoder.layer.1.attention.self.value.weight of shape (768, 768) 2022-03-17 13:57:29,282.282 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.intermediate.dense.bias loaded from bert.decoder.layer.1.intermediate.dense.bias of shape (3072,) 2022-03-17 13:57:29,282.282 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.intermediate.dense.weight loaded from bert.decoder.layer.1.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:57:29,282.282 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.LayerNorm.bias loaded from bert.decoder.layer.1.output.LayerNorm.bias of shape (768,) 2022-03-17 13:57:29,282.282 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.LayerNorm.weight loaded from bert.decoder.layer.1.output.LayerNorm.weight of shape (768,) 2022-03-17 13:57:29,282.282 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.dense.bias loaded from bert.decoder.layer.1.output.dense.bias of shape (768,) 2022-03-17 13:57:29,282.282 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.dense.weight loaded from bert.decoder.layer.1.output.dense.weight of shape (768, 3072) 2022-03-17 13:57:29,283.283 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.LayerNorm.bias loaded from bert.decoder.layer.2.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:57:29,283.283 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.LayerNorm.weight loaded from bert.decoder.layer.2.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:57:29,283.283 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.dense.bias loaded from bert.decoder.layer.2.attention.output.dense.bias of shape (768,) 2022-03-17 13:57:29,283.283 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.dense.weight loaded from bert.decoder.layer.2.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:57:29,283.283 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.key.bias loaded from bert.decoder.layer.2.attention.self.key.bias of shape (768,) 2022-03-17 13:57:29,283.283 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.key.weight loaded from bert.decoder.layer.2.attention.self.key.weight of shape (768, 768) 2022-03-17 13:57:29,283.283 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.query.bias loaded from bert.decoder.layer.2.attention.self.query.bias of shape (768,) 2022-03-17 13:57:29,283.283 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.query.weight loaded from bert.decoder.layer.2.attention.self.query.weight of shape (768, 768) 2022-03-17 13:57:29,283.283 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.decoder.layer.2.attention.self.value.bias loaded from bert.decoder.layer.2.attention.self.value.bias of shape (768,) 2022-03-17 13:57:29,283.283 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.value.weight loaded from bert.decoder.layer.2.attention.self.value.weight of shape (768, 768) 2022-03-17 13:57:29,283.283 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.intermediate.dense.bias loaded from bert.decoder.layer.2.intermediate.dense.bias of shape (3072,) 2022-03-17 13:57:29,283.283 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.intermediate.dense.weight loaded from bert.decoder.layer.2.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:57:29,283.283 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.LayerNorm.bias loaded from bert.decoder.layer.2.output.LayerNorm.bias of shape (768,) 2022-03-17 13:57:29,284.284 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.LayerNorm.weight loaded from bert.decoder.layer.2.output.LayerNorm.weight of shape (768,) 2022-03-17 13:57:29,284.284 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.dense.bias loaded from bert.decoder.layer.2.output.dense.bias of shape (768,) 2022-03-17 13:57:29,284.284 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.dense.weight loaded from bert.decoder.layer.2.output.dense.weight of shape (768, 3072) 2022-03-17 13:57:29,284.284 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.LayerNorm.bias loaded from bert.decoder.layer.3.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:57:29,284.284 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.LayerNorm.weight loaded from bert.decoder.layer.3.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:57:29,284.284 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.dense.bias loaded from bert.decoder.layer.3.attention.output.dense.bias of shape (768,) 2022-03-17 13:57:29,284.284 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.dense.weight loaded from bert.decoder.layer.3.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:57:29,284.284 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.key.bias loaded from bert.decoder.layer.3.attention.self.key.bias of shape (768,) 2022-03-17 13:57:29,284.284 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.key.weight loaded from bert.decoder.layer.3.attention.self.key.weight of shape (768, 768) 2022-03-17 13:57:29,284.284 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.query.bias loaded from bert.decoder.layer.3.attention.self.query.bias of shape (768,) 2022-03-17 13:57:29,284.284 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.query.weight loaded from bert.decoder.layer.3.attention.self.query.weight of shape (768, 768) 2022-03-17 13:57:29,284.284 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.value.bias loaded from bert.decoder.layer.3.attention.self.value.bias of shape (768,) 2022-03-17 
13:57:29,284.284 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.value.weight loaded from bert.decoder.layer.3.attention.self.value.weight of shape (768, 768) 2022-03-17 13:57:29,284.284 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.intermediate.dense.bias loaded from bert.decoder.layer.3.intermediate.dense.bias of shape (3072,) 2022-03-17 13:57:29,285.285 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.intermediate.dense.weight loaded from bert.decoder.layer.3.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:57:29,285.285 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.LayerNorm.bias loaded from bert.decoder.layer.3.output.LayerNorm.bias of shape (768,) 2022-03-17 13:57:29,285.285 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.LayerNorm.weight loaded from bert.decoder.layer.3.output.LayerNorm.weight of shape (768,) 2022-03-17 13:57:29,285.285 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.dense.bias loaded from bert.decoder.layer.3.output.dense.bias of shape (768,) 2022-03-17 13:57:29,285.285 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.dense.weight loaded from bert.decoder.layer.3.output.dense.weight of shape (768, 3072) 2022-03-17 13:57:29,285.285 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.LayerNorm.bias loaded from bert.embeddings.LayerNorm.bias of shape (768,) 2022-03-17 13:57:29,285.285 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.LayerNorm.weight loaded from bert.embeddings.LayerNorm.weight of shape (768,) 2022-03-17 13:57:29,285.285 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.position_embeddings.weight loaded from bert.embeddings.position_embeddings.weight of shape (512, 768) 2022-03-17 13:57:29,285.285 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.token_type_embeddings.weight loaded from bert.embeddings.token_type_embeddings.weight of shape (2, 768) 2022-03-17 13:57:29,285.285 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.word_embeddings.weight loaded from bert.embeddings.word_embeddings.weight of shape (30522, 768) 2022-03-17 13:57:29,285.285 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.proj.bias loaded from bert.encoder.blocks.0.attn.proj.bias of shape (768,) 2022-03-17 13:57:29,285.285 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.proj.weight loaded from bert.encoder.blocks.0.attn.proj.weight of shape (768, 768) 2022-03-17 13:57:29,285.285 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.qkv.bias loaded from bert.encoder.blocks.0.attn.qkv.bias of shape (2304,) 2022-03-17 13:57:29,285.285 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.qkv.weight loaded from bert.encoder.blocks.0.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:57:29,286.286 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc1.bias loaded from bert.encoder.blocks.0.mlp.fc1.bias of shape (3072,) 2022-03-17 13:57:29,286.286 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc1.weight loaded from 
bert.encoder.blocks.0.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:57:29,286.286 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc2.bias loaded from bert.encoder.blocks.0.mlp.fc2.bias of shape (768,) 2022-03-17 13:57:29,286.286 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc2.weight loaded from bert.encoder.blocks.0.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:57:29,286.286 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm1.bias loaded from bert.encoder.blocks.0.norm1.bias of shape (768,) 2022-03-17 13:57:29,286.286 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm1.weight loaded from bert.encoder.blocks.0.norm1.weight of shape (768,) 2022-03-17 13:57:29,286.286 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm2.bias loaded from bert.encoder.blocks.0.norm2.bias of shape (768,) 2022-03-17 13:57:29,286.286 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm2.weight loaded from bert.encoder.blocks.0.norm2.weight of shape (768,) 2022-03-17 13:57:29,286.286 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.proj.bias loaded from bert.encoder.blocks.1.attn.proj.bias of shape (768,) 2022-03-17 13:57:29,286.286 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.proj.weight loaded from bert.encoder.blocks.1.attn.proj.weight of shape (768, 768) 2022-03-17 13:57:29,286.286 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.qkv.bias loaded from bert.encoder.blocks.1.attn.qkv.bias of shape (2304,) 2022-03-17 13:57:29,286.286 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.qkv.weight loaded from bert.encoder.blocks.1.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:57:29,286.286 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc1.bias loaded from bert.encoder.blocks.1.mlp.fc1.bias of shape (3072,) 2022-03-17 13:57:29,287.287 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc1.weight loaded from bert.encoder.blocks.1.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:57:29,287.287 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc2.bias loaded from bert.encoder.blocks.1.mlp.fc2.bias of shape (768,) 2022-03-17 13:57:29,287.287 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc2.weight loaded from bert.encoder.blocks.1.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:57:29,287.287 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm1.bias loaded from bert.encoder.blocks.1.norm1.bias of shape (768,) 2022-03-17 13:57:29,287.287 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm1.weight loaded from bert.encoder.blocks.1.norm1.weight of shape (768,) 2022-03-17 13:57:29,287.287 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm2.bias loaded from bert.encoder.blocks.1.norm2.bias of shape (768,) 2022-03-17 13:57:29,287.287 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm2.weight loaded from bert.encoder.blocks.1.norm2.weight of shape (768,) 2022-03-17 13:57:29,287.287 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.encoder.blocks.10.attn.proj.bias loaded from bert.encoder.blocks.10.attn.proj.bias of shape (768,) 2022-03-17 13:57:29,287.287 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.proj.weight loaded from bert.encoder.blocks.10.attn.proj.weight of shape (768, 768) 2022-03-17 13:57:29,287.287 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.qkv.bias loaded from bert.encoder.blocks.10.attn.qkv.bias of shape (2304,) 2022-03-17 13:57:29,287.287 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.qkv.weight loaded from bert.encoder.blocks.10.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:57:29,287.287 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc1.bias loaded from bert.encoder.blocks.10.mlp.fc1.bias of shape (3072,) 2022-03-17 13:57:29,287.287 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc1.weight loaded from bert.encoder.blocks.10.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:57:29,287.287 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc2.bias loaded from bert.encoder.blocks.10.mlp.fc2.bias of shape (768,) 2022-03-17 13:57:29,288.288 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc2.weight loaded from bert.encoder.blocks.10.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:57:29,288.288 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm1.bias loaded from bert.encoder.blocks.10.norm1.bias of shape (768,) 2022-03-17 13:57:29,288.288 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm1.weight loaded from bert.encoder.blocks.10.norm1.weight of shape (768,) 2022-03-17 13:57:29,288.288 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm2.bias loaded from bert.encoder.blocks.10.norm2.bias of shape (768,) 2022-03-17 13:57:29,288.288 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm2.weight loaded from bert.encoder.blocks.10.norm2.weight of shape (768,) 2022-03-17 13:57:29,288.288 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.proj.bias loaded from bert.encoder.blocks.11.attn.proj.bias of shape (768,) 2022-03-17 13:57:29,288.288 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.proj.weight loaded from bert.encoder.blocks.11.attn.proj.weight of shape (768, 768) 2022-03-17 13:57:29,288.288 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.qkv.bias loaded from bert.encoder.blocks.11.attn.qkv.bias of shape (2304,) 2022-03-17 13:57:29,288.288 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.qkv.weight loaded from bert.encoder.blocks.11.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:57:29,288.288 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc1.bias loaded from bert.encoder.blocks.11.mlp.fc1.bias of shape (3072,) 2022-03-17 13:57:29,288.288 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc1.weight loaded from bert.encoder.blocks.11.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:57:29,288.288 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc2.bias loaded from 
bert.encoder.blocks.11.mlp.fc2.bias of shape (768,) 2022-03-17 13:57:29,288.288 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc2.weight loaded from bert.encoder.blocks.11.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:57:29,288.288 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm1.bias loaded from bert.encoder.blocks.11.norm1.bias of shape (768,) 2022-03-17 13:57:29,289.289 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm1.weight loaded from bert.encoder.blocks.11.norm1.weight of shape (768,) 2022-03-17 13:57:29,289.289 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm2.bias loaded from bert.encoder.blocks.11.norm2.bias of shape (768,) 2022-03-17 13:57:29,289.289 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm2.weight loaded from bert.encoder.blocks.11.norm2.weight of shape (768,) 2022-03-17 13:57:29,289.289 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.proj.bias loaded from bert.encoder.blocks.2.attn.proj.bias of shape (768,) 2022-03-17 13:57:29,289.289 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.proj.weight loaded from bert.encoder.blocks.2.attn.proj.weight of shape (768, 768) 2022-03-17 13:57:29,289.289 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.qkv.bias loaded from bert.encoder.blocks.2.attn.qkv.bias of shape (2304,) 2022-03-17 13:57:29,289.289 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.qkv.weight loaded from bert.encoder.blocks.2.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:57:29,289.289 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc1.bias loaded from bert.encoder.blocks.2.mlp.fc1.bias of shape (3072,) 2022-03-17 13:57:29,289.289 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc1.weight loaded from bert.encoder.blocks.2.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:57:29,289.289 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc2.bias loaded from bert.encoder.blocks.2.mlp.fc2.bias of shape (768,) 2022-03-17 13:57:29,289.289 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc2.weight loaded from bert.encoder.blocks.2.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:57:29,289.289 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm1.bias loaded from bert.encoder.blocks.2.norm1.bias of shape (768,) 2022-03-17 13:57:29,289.289 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm1.weight loaded from bert.encoder.blocks.2.norm1.weight of shape (768,) 2022-03-17 13:57:29,289.289 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm2.bias loaded from bert.encoder.blocks.2.norm2.bias of shape (768,) 2022-03-17 13:57:29,290.290 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm2.weight loaded from bert.encoder.blocks.2.norm2.weight of shape (768,) 2022-03-17 13:57:29,290.290 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.proj.bias loaded from bert.encoder.blocks.3.attn.proj.bias of shape (768,) 2022-03-17 13:57:29,290.290 2829:checkpoint.py:99 
align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.proj.weight loaded from bert.encoder.blocks.3.attn.proj.weight of shape (768, 768) 2022-03-17 13:57:29,290.290 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.qkv.bias loaded from bert.encoder.blocks.3.attn.qkv.bias of shape (2304,) 2022-03-17 13:57:29,290.290 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.qkv.weight loaded from bert.encoder.blocks.3.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:57:29,290.290 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc1.bias loaded from bert.encoder.blocks.3.mlp.fc1.bias of shape (3072,) 2022-03-17 13:57:29,290.290 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc1.weight loaded from bert.encoder.blocks.3.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:57:29,290.290 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc2.bias loaded from bert.encoder.blocks.3.mlp.fc2.bias of shape (768,) 2022-03-17 13:57:29,290.290 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc2.weight loaded from bert.encoder.blocks.3.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:57:29,290.290 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm1.bias loaded from bert.encoder.blocks.3.norm1.bias of shape (768,) 2022-03-17 13:57:29,290.290 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm1.weight loaded from bert.encoder.blocks.3.norm1.weight of shape (768,) 2022-03-17 13:57:29,290.290 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm2.bias loaded from bert.encoder.blocks.3.norm2.bias of shape (768,) 2022-03-17 13:57:29,291.291 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm2.weight loaded from bert.encoder.blocks.3.norm2.weight of shape (768,) 2022-03-17 13:57:29,291.291 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.attn.proj.bias loaded from bert.encoder.blocks.4.attn.proj.bias of shape (768,) 2022-03-17 13:57:29,291.291 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.attn.proj.weight loaded from bert.encoder.blocks.4.attn.proj.weight of shape (768, 768) 2022-03-17 13:57:29,291.291 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.attn.qkv.bias loaded from bert.encoder.blocks.4.attn.qkv.bias of shape (2304,) 2022-03-17 13:57:29,291.291 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.attn.qkv.weight loaded from bert.encoder.blocks.4.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:57:29,291.291 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.mlp.fc1.bias loaded from bert.encoder.blocks.4.mlp.fc1.bias of shape (3072,) 2022-03-17 13:57:29,291.291 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.mlp.fc1.weight loaded from bert.encoder.blocks.4.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:57:29,291.291 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.mlp.fc2.bias loaded from bert.encoder.blocks.4.mlp.fc2.bias of shape (768,) 2022-03-17 13:57:29,291.291 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.mlp.fc2.weight loaded from 
bert.encoder.blocks.4.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:57:29,291.291 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.norm1.bias loaded from bert.encoder.blocks.4.norm1.bias of shape (768,) 2022-03-17 13:57:29,291.291 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.norm1.weight loaded from bert.encoder.blocks.4.norm1.weight of shape (768,) 2022-03-17 13:57:29,291.291 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.norm2.bias loaded from bert.encoder.blocks.4.norm2.bias of shape (768,) 2022-03-17 13:57:29,291.291 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.norm2.weight loaded from bert.encoder.blocks.4.norm2.weight of shape (768,) 2022-03-17 13:57:29,291.291 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.attn.proj.bias loaded from bert.encoder.blocks.5.attn.proj.bias of shape (768,) 2022-03-17 13:57:29,292.292 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.attn.proj.weight loaded from bert.encoder.blocks.5.attn.proj.weight of shape (768, 768) 2022-03-17 13:57:29,292.292 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.attn.qkv.bias loaded from bert.encoder.blocks.5.attn.qkv.bias of shape (2304,) 2022-03-17 13:57:29,292.292 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.attn.qkv.weight loaded from bert.encoder.blocks.5.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:57:29,292.292 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.mlp.fc1.bias loaded from bert.encoder.blocks.5.mlp.fc1.bias of shape (3072,) 2022-03-17 13:57:29,292.292 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.mlp.fc1.weight loaded from bert.encoder.blocks.5.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:57:29,292.292 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.mlp.fc2.bias loaded from bert.encoder.blocks.5.mlp.fc2.bias of shape (768,) 2022-03-17 13:57:29,292.292 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.mlp.fc2.weight loaded from bert.encoder.blocks.5.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:57:29,292.292 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm1.bias loaded from bert.encoder.blocks.5.norm1.bias of shape (768,) 2022-03-17 13:57:29,292.292 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm1.weight loaded from bert.encoder.blocks.5.norm1.weight of shape (768,) 2022-03-17 13:57:29,292.292 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm2.bias loaded from bert.encoder.blocks.5.norm2.bias of shape (768,) 2022-03-17 13:57:29,292.292 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm2.weight loaded from bert.encoder.blocks.5.norm2.weight of shape (768,) 2022-03-17 13:57:29,292.292 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.attn.proj.bias loaded from bert.encoder.blocks.6.attn.proj.bias of shape (768,) 2022-03-17 13:57:29,293.293 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.attn.proj.weight loaded from bert.encoder.blocks.6.attn.proj.weight of shape (768, 768) 2022-03-17 13:57:29,293.293 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.encoder.blocks.6.attn.qkv.bias loaded from bert.encoder.blocks.6.attn.qkv.bias of shape (2304,) 2022-03-17 13:57:29,293.293 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.attn.qkv.weight loaded from bert.encoder.blocks.6.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:57:29,293.293 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc1.bias loaded from bert.encoder.blocks.6.mlp.fc1.bias of shape (3072,) 2022-03-17 13:57:29,293.293 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc1.weight loaded from bert.encoder.blocks.6.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:57:29,293.293 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc2.bias loaded from bert.encoder.blocks.6.mlp.fc2.bias of shape (768,) 2022-03-17 13:57:29,293.293 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc2.weight loaded from bert.encoder.blocks.6.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:57:29,293.293 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm1.bias loaded from bert.encoder.blocks.6.norm1.bias of shape (768,) 2022-03-17 13:57:29,293.293 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm1.weight loaded from bert.encoder.blocks.6.norm1.weight of shape (768,) 2022-03-17 13:57:29,293.293 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm2.bias loaded from bert.encoder.blocks.6.norm2.bias of shape (768,) 2022-03-17 13:57:29,293.293 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm2.weight loaded from bert.encoder.blocks.6.norm2.weight of shape (768,) 2022-03-17 13:57:29,293.293 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.proj.bias loaded from bert.encoder.blocks.7.attn.proj.bias of shape (768,) 2022-03-17 13:57:29,293.293 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.proj.weight loaded from bert.encoder.blocks.7.attn.proj.weight of shape (768, 768) 2022-03-17 13:57:29,293.293 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.qkv.bias loaded from bert.encoder.blocks.7.attn.qkv.bias of shape (2304,) 2022-03-17 13:57:29,294.294 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.qkv.weight loaded from bert.encoder.blocks.7.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:57:29,294.294 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc1.bias loaded from bert.encoder.blocks.7.mlp.fc1.bias of shape (3072,) 2022-03-17 13:57:29,294.294 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc1.weight loaded from bert.encoder.blocks.7.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:57:29,294.294 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc2.bias loaded from bert.encoder.blocks.7.mlp.fc2.bias of shape (768,) 2022-03-17 13:57:29,294.294 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc2.weight loaded from bert.encoder.blocks.7.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:57:29,294.294 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm1.bias loaded from bert.encoder.blocks.7.norm1.bias of shape (768,) 2022-03-17 
13:57:29,294.294 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm1.weight loaded from bert.encoder.blocks.7.norm1.weight of shape (768,) 2022-03-17 13:57:29,294.294 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm2.bias loaded from bert.encoder.blocks.7.norm2.bias of shape (768,) 2022-03-17 13:57:29,294.294 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm2.weight loaded from bert.encoder.blocks.7.norm2.weight of shape (768,) 2022-03-17 13:57:29,294.294 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.proj.bias loaded from bert.encoder.blocks.8.attn.proj.bias of shape (768,) 2022-03-17 13:57:29,294.294 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.proj.weight loaded from bert.encoder.blocks.8.attn.proj.weight of shape (768, 768) 2022-03-17 13:57:29,294.294 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.qkv.bias loaded from bert.encoder.blocks.8.attn.qkv.bias of shape (2304,) 2022-03-17 13:57:29,294.294 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.qkv.weight loaded from bert.encoder.blocks.8.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:57:29,294.294 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc1.bias loaded from bert.encoder.blocks.8.mlp.fc1.bias of shape (3072,) 2022-03-17 13:57:29,295.295 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc1.weight loaded from bert.encoder.blocks.8.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:57:29,295.295 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc2.bias loaded from bert.encoder.blocks.8.mlp.fc2.bias of shape (768,) 2022-03-17 13:57:29,295.295 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc2.weight loaded from bert.encoder.blocks.8.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:57:29,295.295 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm1.bias loaded from bert.encoder.blocks.8.norm1.bias of shape (768,) 2022-03-17 13:57:29,295.295 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm1.weight loaded from bert.encoder.blocks.8.norm1.weight of shape (768,) 2022-03-17 13:57:29,295.295 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm2.bias loaded from bert.encoder.blocks.8.norm2.bias of shape (768,) 2022-03-17 13:57:29,295.295 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm2.weight loaded from bert.encoder.blocks.8.norm2.weight of shape (768,) 2022-03-17 13:57:29,295.295 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.proj.bias loaded from bert.encoder.blocks.9.attn.proj.bias of shape (768,) 2022-03-17 13:57:29,295.295 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.proj.weight loaded from bert.encoder.blocks.9.attn.proj.weight of shape (768, 768) 2022-03-17 13:57:29,295.295 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.qkv.bias loaded from bert.encoder.blocks.9.attn.qkv.bias of shape (2304,) 2022-03-17 13:57:29,295.295 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.qkv.weight loaded from 
bert.encoder.blocks.9.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:57:29,295.295 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc1.bias loaded from bert.encoder.blocks.9.mlp.fc1.bias of shape (3072,) 2022-03-17 13:57:29,295.295 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc1.weight loaded from bert.encoder.blocks.9.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:57:29,296.296 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc2.bias loaded from bert.encoder.blocks.9.mlp.fc2.bias of shape (768,) 2022-03-17 13:57:29,296.296 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc2.weight loaded from bert.encoder.blocks.9.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:57:29,296.296 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm1.bias loaded from bert.encoder.blocks.9.norm1.bias of shape (768,) 2022-03-17 13:57:29,296.296 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm1.weight loaded from bert.encoder.blocks.9.norm1.weight of shape (768,) 2022-03-17 13:57:29,296.296 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm2.bias loaded from bert.encoder.blocks.9.norm2.bias of shape (768,) 2022-03-17 13:57:29,296.296 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm2.weight loaded from bert.encoder.blocks.9.norm2.weight of shape (768,) 2022-03-17 13:57:29,296.296 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.proj.bias loaded from bert.encoder.tag_blocks.0.attn.proj.bias of shape (768,) 2022-03-17 13:57:29,296.296 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.proj.weight loaded from bert.encoder.tag_blocks.0.attn.proj.weight of shape (768, 768) 2022-03-17 13:57:29,296.296 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.qkv.bias loaded from bert.encoder.tag_blocks.0.attn.qkv.bias of shape (2304,) 2022-03-17 13:57:29,296.296 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.qkv.weight loaded from bert.encoder.tag_blocks.0.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:57:29,296.296 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc1.bias loaded from bert.encoder.tag_blocks.0.mlp.fc1.bias of shape (3072,) 2022-03-17 13:57:29,296.296 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc1.weight loaded from bert.encoder.tag_blocks.0.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:57:29,296.296 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc2.bias loaded from bert.encoder.tag_blocks.0.mlp.fc2.bias of shape (768,) 2022-03-17 13:57:29,296.296 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc2.weight loaded from bert.encoder.tag_blocks.0.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:57:29,297.297 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm1.bias loaded from bert.encoder.tag_blocks.0.norm1.bias of shape (768,) 2022-03-17 13:57:29,297.297 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm1.weight loaded from bert.encoder.tag_blocks.0.norm1.weight of shape 
(768,) 2022-03-17 13:57:29,297.297 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm2.bias loaded from bert.encoder.tag_blocks.0.norm2.bias of shape (768,) 2022-03-17 13:57:29,297.297 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm2.weight loaded from bert.encoder.tag_blocks.0.norm2.weight of shape (768,) 2022-03-17 13:57:29,297.297 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.proj.bias loaded from bert.encoder.tag_blocks.1.attn.proj.bias of shape (768,) 2022-03-17 13:57:29,297.297 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.proj.weight loaded from bert.encoder.tag_blocks.1.attn.proj.weight of shape (768, 768) 2022-03-17 13:57:29,297.297 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.qkv.bias loaded from bert.encoder.tag_blocks.1.attn.qkv.bias of shape (2304,) 2022-03-17 13:57:29,297.297 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.qkv.weight loaded from bert.encoder.tag_blocks.1.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:57:29,297.297 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc1.bias loaded from bert.encoder.tag_blocks.1.mlp.fc1.bias of shape (3072,) 2022-03-17 13:57:29,297.297 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc1.weight loaded from bert.encoder.tag_blocks.1.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:57:29,297.297 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc2.bias loaded from bert.encoder.tag_blocks.1.mlp.fc2.bias of shape (768,) 2022-03-17 13:57:29,297.297 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc2.weight loaded from bert.encoder.tag_blocks.1.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:57:29,297.297 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm1.bias loaded from bert.encoder.tag_blocks.1.norm1.bias of shape (768,) 2022-03-17 13:57:29,297.297 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm1.weight loaded from bert.encoder.tag_blocks.1.norm1.weight of shape (768,) 2022-03-17 13:57:29,298.298 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm2.bias loaded from bert.encoder.tag_blocks.1.norm2.bias of shape (768,) 2022-03-17 13:57:29,298.298 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm2.weight loaded from bert.encoder.tag_blocks.1.norm2.weight of shape (768,) 2022-03-17 13:57:29,298.298 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.proj.bias loaded from bert.encoder.tag_blocks.2.attn.proj.bias of shape (768,) 2022-03-17 13:57:29,298.298 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.proj.weight loaded from bert.encoder.tag_blocks.2.attn.proj.weight of shape (768, 768) 2022-03-17 13:57:29,298.298 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.qkv.bias loaded from bert.encoder.tag_blocks.2.attn.qkv.bias of shape (2304,) 2022-03-17 13:57:29,298.298 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.qkv.weight loaded from 
bert.encoder.tag_blocks.2.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:57:29,298.298 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc1.bias loaded from bert.encoder.tag_blocks.2.mlp.fc1.bias of shape (3072,) 2022-03-17 13:57:29,298.298 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc1.weight loaded from bert.encoder.tag_blocks.2.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:57:29,298.298 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc2.bias loaded from bert.encoder.tag_blocks.2.mlp.fc2.bias of shape (768,) 2022-03-17 13:57:29,298.298 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc2.weight loaded from bert.encoder.tag_blocks.2.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:57:29,298.298 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm1.bias loaded from bert.encoder.tag_blocks.2.norm1.bias of shape (768,) 2022-03-17 13:57:29,298.298 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm1.weight loaded from bert.encoder.tag_blocks.2.norm1.weight of shape (768,) 2022-03-17 13:57:29,298.298 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm2.bias loaded from bert.encoder.tag_blocks.2.norm2.bias of shape (768,) 2022-03-17 13:57:29,299.299 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm2.weight loaded from bert.encoder.tag_blocks.2.norm2.weight of shape (768,) 2022-03-17 13:57:29,299.299 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.proj.bias loaded from bert.encoder.tag_blocks.3.attn.proj.bias of shape (768,) 2022-03-17 13:57:29,299.299 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.proj.weight loaded from bert.encoder.tag_blocks.3.attn.proj.weight of shape (768, 768) 2022-03-17 13:57:29,299.299 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.qkv.bias loaded from bert.encoder.tag_blocks.3.attn.qkv.bias of shape (2304,) 2022-03-17 13:57:29,299.299 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.qkv.weight loaded from bert.encoder.tag_blocks.3.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:57:29,299.299 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc1.bias loaded from bert.encoder.tag_blocks.3.mlp.fc1.bias of shape (3072,) 2022-03-17 13:57:29,299.299 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc1.weight loaded from bert.encoder.tag_blocks.3.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:57:29,299.299 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc2.bias loaded from bert.encoder.tag_blocks.3.mlp.fc2.bias of shape (768,) 2022-03-17 13:57:29,299.299 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc2.weight loaded from bert.encoder.tag_blocks.3.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:57:29,299.299 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm1.bias loaded from bert.encoder.tag_blocks.3.norm1.bias of shape (768,) 2022-03-17 13:57:29,299.299 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.encoder.tag_blocks.3.norm1.weight loaded from bert.encoder.tag_blocks.3.norm1.weight of shape (768,) 2022-03-17 13:57:29,299.299 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm2.bias loaded from bert.encoder.tag_blocks.3.norm2.bias of shape (768,) 2022-03-17 13:57:29,299.299 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm2.weight loaded from bert.encoder.tag_blocks.3.norm2.weight of shape (768,) 2022-03-17 13:57:29,300.300 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.LayerNorm.bias loaded from bert.extra_embeddings.LayerNorm.bias of shape (768,) 2022-03-17 13:57:29,300.300 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.LayerNorm.weight loaded from bert.extra_embeddings.LayerNorm.weight of shape (768,) 2022-03-17 13:57:29,300.300 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.position_embeddings.weight loaded from bert.extra_embeddings.position_embeddings.weight of shape (512, 768) 2022-03-17 13:57:29,300.300 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.token_type_embeddings.weight loaded from bert.extra_embeddings.token_type_embeddings.weight of shape (2, 768) 2022-03-17 13:57:29,300.300 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.word_embeddings.weight loaded from bert.extra_embeddings.word_embeddings.weight of shape (30522, 768) 2022-03-17 13:57:29,300.300 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.pooler.dense.bias loaded from bert.pooler.dense.bias of shape (768,) 2022-03-17 13:57:29,300.300 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.pooler.dense.weight loaded from bert.pooler.dense.weight of shape (768, 768) 2022-03-17 13:57:29,300.300 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.bias loaded from bert.tag_logit.predictions.bias of shape (30522,) 2022-03-17 13:57:29,300.300 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.decoder.weight loaded from bert.tag_logit.predictions.decoder.weight of shape (30522, 768) 2022-03-17 13:57:29,300.300 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.LayerNorm.bias loaded from bert.tag_logit.predictions.transform.LayerNorm.bias of shape (768,) 2022-03-17 13:57:29,300.300 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.LayerNorm.weight loaded from bert.tag_logit.predictions.transform.LayerNorm.weight of shape (768,) 2022-03-17 13:57:29,300.300 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.dense.bias loaded from bert.tag_logit.predictions.transform.dense.bias of shape (768,) 2022-03-17 13:57:29,301.301 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.dense.weight loaded from bert.tag_logit.predictions.transform.dense.weight of shape (768, 768) 2022-03-17 13:57:29,301.301 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.bias loaded from cls.predictions.bias of shape (30522,) 2022-03-17 13:57:29,301.301 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.decoder.weight loaded from cls.predictions.decoder.weight of shape (30522, 768) 2022-03-17 13:57:29,301.301 2829:checkpoint.py:99 
align_and_update_state_dicts(): module.cls.predictions.transform.LayerNorm.bias loaded from cls.predictions.transform.LayerNorm.bias of shape (768,)
2022-03-17 13:57:29,301.301 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.LayerNorm.weight loaded from cls.predictions.transform.LayerNorm.weight of shape (768,)
2022-03-17 13:57:29,301.301 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.dense.bias loaded from cls.predictions.transform.dense.bias of shape (768,)
2022-03-17 13:57:29,301.301 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.dense.weight loaded from cls.predictions.transform.dense.weight of shape (768, 768)
2022-03-17 13:57:29,301.301 2829:checkpoint.py:104 align_and_update_state_dicts(): target model param = 288; name matched = 288; loaded = 288
2022-03-17 13:57:29,301.301 2829:checkpoint.py:107 align_and_update_state_dicts(): from loaded; ignore = []
2022-03-17 13:57:29,304.304 2829:torch_common.py:1069 load_model_state_ignore_mismatch(): unique keys in init dict = ['module.cls.predictions.decoder.weight']; total = 1
2022-03-17 13:57:29,464.464 2829:torch_common.py:1074 load_model_state_ignore_mismatch(): unique key (not initialized) in current model = ['module.cls.predictions.decoder.weight']
2022-03-17 13:57:29,788.788 2829:qd_common.py:3452 print_frame_info(): func name = __init__; self = ; data = TaxCocoCaption; split = test; add_key = False; backend = cv; hold_buffer = 0; save_original = False
2022-03-17 13:57:29,815.815 2829:uni_pipeline.py:509 get_data_loader(): sampler =
2022-03-17 13:57:29,815.815 2829:uni_pipeline.py:1000 predict(): writing output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0050000.pt.TaxCocoCaption.test.crop384.predict.tsv_0_32.tsv
2022-03-17 13:57:29,887.887 2829:uni_pipeline.py:907 predict_iter(): DatasetPlusTransform(dataset=, transform=Compose(
    Compose(
        ImageTransform2Dict(image_transform=Compose(
            ToPILImage()
            Resize(size=384, interpolation=PIL.Image.BICUBIC)
            CenterCrop(size=(384, 384))
            ToTensor()
            Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
        ))
    )
    LoadLabel(data=TaxCocoCaption, split=test, version=vinvl)
    TransCaptionTensorizer(tensorizer=, pad_to_max=True, pad_image_to_max=True)
))
uni_pipeline.py:908: 0%| | 0/10 [00:00<?, ?it/s]
--- Logging error ---
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 1025, in emit
    msg = self.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 869, in format
    return fmt.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 608, in format
    record.message = record.getMessage()
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 367, in getMessage
    msg = str(self.msg)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 236, in __repr__
    return str(self.to_json_string())
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 245, in to_json_string
    return json.dumps(self.to_dict(), indent=2, sort_keys=True) + "\n"
  File "/opt/conda/lib/python3.7/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 201, in encode
    chunks = list(chunks)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 431, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "/opt/conda/lib/python3.7/json/encoder.py", line 438, in _iterencode
    o = _default(o)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type BertTokenizer is not JSON serializable
Call stack:
  File "src/qd/pipeline.py", line 1368, in <module>
    locals()[function_name](**kwargs)
  File "src/qd/pipeline.py", line 637, in pipeline_train_eval_multi
    pip.monitor_train()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1224, in monitor_train
    need_wait_models = self.pred_eval_intermediate_models()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1286, in pred_eval_intermediate_models
    pred = self.ensure_predict(model_file=model_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 886, in ensure_predict
    self.predict(model_file, predict_result_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 989, in predict
    model = self.get_model(is_train=False)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model
    model = self.get_raw_model(is_train)
  File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model
    model = TaggerEncDecSplitForImageCaptioning(config=config) # init from scratch
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__
    self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__
    self.encoder = TIMMVitSplitEncoder(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__
    logging.info(config)
Unable to print the message and arguments - possible formatting error.
Use the traceback above to help find the error.
--- Logging error ---
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 1025, in emit
    msg = self.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 869, in format
    return fmt.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 608, in format
    record.message = record.getMessage()
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 367, in getMessage
    msg = str(self.msg)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 236, in __repr__
    return str(self.to_json_string())
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 245, in to_json_string
    return json.dumps(self.to_dict(), indent=2, sort_keys=True) + "\n"
  File "/opt/conda/lib/python3.7/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 201, in encode
    chunks = list(chunks)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 431, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "/opt/conda/lib/python3.7/json/encoder.py", line 438, in _iterencode
    o = _default(o)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type BertTokenizer is not JSON serializable
Call stack:
  File "src/qd/pipeline.py", line 1368, in <module>
    locals()[function_name](**kwargs)
  File "src/qd/pipeline.py", line 637, in pipeline_train_eval_multi
    pip.monitor_train()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1224, in monitor_train
    need_wait_models = self.pred_eval_intermediate_models()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1286, in pred_eval_intermediate_models
    pred = self.ensure_predict(model_file=model_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 886, in ensure_predict
    self.predict(model_file, predict_result_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 989, in predict
    model = self.get_model(is_train=False)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model
    model = self.get_raw_model(is_train)
  File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model
    model = TaggerEncDecSplitForImageCaptioning(config=config) # init from scratch
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__
    self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__
    self.encoder = TIMMVitSplitEncoder(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__
    logging.info(config)
Unable to print the message and arguments - possible formatting error.
Use the traceback above to help find the error.
2022-03-17 13:59:43,919.919 2829:modeling_bert.py:529 __init__(): TIMM Split image encoder load from pre-trained: True 2022-03-17 13:59:44,988.988 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py 2022-03-17 13:59:46,478.478 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py 2022-03-17 13:59:47,096.096 2829:modeling_bert.py:2677 __init__(): BertImgModel Image Dimension: 2054 2022-03-17 13:59:47,244.244 2829:modeling_bert.py:2688 __init__(): BertImgModel Image Dimension: 2054 2022-03-17 13:59:48,014.014 2829:tagger_caption_uni_pipeline_expanding.py:1130 get_image_encoder_model(): VIT image encoder loaded from pre-trained weight! Note that this might be replaced by pre-trained checkpoint later! 2022-03-17 13:59:48,014.014 2829:tagger_caption_uni_pipeline_expanding.py:1134 get_image_encoder_model(): Non-Patch Selection Mode. 2022-03-17 13:59:49,109.109 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py 2022-03-17 13:59:49,673.673 2829:checkpoint.py:240 load(): Loading checkpoint from output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0055000.pt 2022-03-17 13:59:55,915.915 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.cls_token loaded from image_encoder.module.cls_token of shape (1, 1, 768) 2022-03-17 13:59:55,915.915 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.head.bias loaded from image_encoder.module.head.bias of shape (1000,) 2022-03-17 13:59:55,915.915 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.head.weight loaded from image_encoder.module.head.weight of shape (1000, 768) 2022-03-17 13:59:55,916.916 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.patch_embed.proj.bias loaded from image_encoder.module.patch_embed.proj.bias of shape (768,) 2022-03-17 13:59:55,916.916 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.patch_embed.proj.weight loaded from image_encoder.module.patch_embed.proj.weight of shape (768, 3, 16, 16) 2022-03-17 13:59:55,916.916 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.pos_embed loaded from image_encoder.module.pos_embed of shape (1, 577, 768) 2022-03-17 13:59:55,916.916 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.caption_pooler.dense.bias loaded from bert.caption_pooler.dense.bias of shape (768,) 2022-03-17 13:59:55,916.916 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.caption_pooler.dense.weight loaded from bert.caption_pooler.dense.weight of shape (768, 768) 2022-03-17 13:59:55,916.916 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.LayerNorm.bias loaded from bert.decoder.layer.0.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:59:55,916.916 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.LayerNorm.weight loaded from 
bert.decoder.layer.0.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:59:55,916.916 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.dense.bias loaded from bert.decoder.layer.0.attention.output.dense.bias of shape (768,) 2022-03-17 13:59:55,916.916 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.dense.weight loaded from bert.decoder.layer.0.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:59:55,916.916 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.key.bias loaded from bert.decoder.layer.0.attention.self.key.bias of shape (768,) 2022-03-17 13:59:55,916.916 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.key.weight loaded from bert.decoder.layer.0.attention.self.key.weight of shape (768, 768) 2022-03-17 13:59:55,916.916 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.query.bias loaded from bert.decoder.layer.0.attention.self.query.bias of shape (768,) 2022-03-17 13:59:55,917.917 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.query.weight loaded from bert.decoder.layer.0.attention.self.query.weight of shape (768, 768) 2022-03-17 13:59:55,917.917 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.value.bias loaded from bert.decoder.layer.0.attention.self.value.bias of shape (768,) 2022-03-17 13:59:55,917.917 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.value.weight loaded from bert.decoder.layer.0.attention.self.value.weight of shape (768, 768) 2022-03-17 13:59:55,917.917 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.intermediate.dense.bias loaded from bert.decoder.layer.0.intermediate.dense.bias of shape (3072,) 2022-03-17 13:59:55,917.917 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.intermediate.dense.weight loaded from bert.decoder.layer.0.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:59:55,917.917 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.LayerNorm.bias loaded from bert.decoder.layer.0.output.LayerNorm.bias of shape (768,) 2022-03-17 13:59:55,917.917 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.LayerNorm.weight loaded from bert.decoder.layer.0.output.LayerNorm.weight of shape (768,) 2022-03-17 13:59:55,917.917 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.dense.bias loaded from bert.decoder.layer.0.output.dense.bias of shape (768,) 2022-03-17 13:59:55,917.917 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.dense.weight loaded from bert.decoder.layer.0.output.dense.weight of shape (768, 3072) 2022-03-17 13:59:55,917.917 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.LayerNorm.bias loaded from bert.decoder.layer.1.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:59:55,917.917 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.LayerNorm.weight loaded from bert.decoder.layer.1.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:59:55,917.917 2829:checkpoint.py:99 
align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.dense.bias loaded from bert.decoder.layer.1.attention.output.dense.bias of shape (768,) 2022-03-17 13:59:55,917.917 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.dense.weight loaded from bert.decoder.layer.1.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:59:55,918.918 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.key.bias loaded from bert.decoder.layer.1.attention.self.key.bias of shape (768,) 2022-03-17 13:59:55,918.918 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.key.weight loaded from bert.decoder.layer.1.attention.self.key.weight of shape (768, 768) 2022-03-17 13:59:55,918.918 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.query.bias loaded from bert.decoder.layer.1.attention.self.query.bias of shape (768,) 2022-03-17 13:59:55,918.918 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.query.weight loaded from bert.decoder.layer.1.attention.self.query.weight of shape (768, 768) 2022-03-17 13:59:55,918.918 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.value.bias loaded from bert.decoder.layer.1.attention.self.value.bias of shape (768,) 2022-03-17 13:59:55,918.918 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.value.weight loaded from bert.decoder.layer.1.attention.self.value.weight of shape (768, 768) 2022-03-17 13:59:55,918.918 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.intermediate.dense.bias loaded from bert.decoder.layer.1.intermediate.dense.bias of shape (3072,) 2022-03-17 13:59:55,918.918 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.intermediate.dense.weight loaded from bert.decoder.layer.1.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:59:55,918.918 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.LayerNorm.bias loaded from bert.decoder.layer.1.output.LayerNorm.bias of shape (768,) 2022-03-17 13:59:55,918.918 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.LayerNorm.weight loaded from bert.decoder.layer.1.output.LayerNorm.weight of shape (768,) 2022-03-17 13:59:55,918.918 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.dense.bias loaded from bert.decoder.layer.1.output.dense.bias of shape (768,) 2022-03-17 13:59:55,918.918 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.dense.weight loaded from bert.decoder.layer.1.output.dense.weight of shape (768, 3072) 2022-03-17 13:59:55,918.918 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.LayerNorm.bias loaded from bert.decoder.layer.2.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:59:55,919.919 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.LayerNorm.weight loaded from bert.decoder.layer.2.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:59:55,919.919 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.dense.bias loaded from bert.decoder.layer.2.attention.output.dense.bias of 
shape (768,) 2022-03-17 13:59:55,919.919 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.dense.weight loaded from bert.decoder.layer.2.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:59:55,919.919 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.key.bias loaded from bert.decoder.layer.2.attention.self.key.bias of shape (768,) 2022-03-17 13:59:55,919.919 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.key.weight loaded from bert.decoder.layer.2.attention.self.key.weight of shape (768, 768) 2022-03-17 13:59:55,919.919 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.query.bias loaded from bert.decoder.layer.2.attention.self.query.bias of shape (768,) 2022-03-17 13:59:55,919.919 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.query.weight loaded from bert.decoder.layer.2.attention.self.query.weight of shape (768, 768) 2022-03-17 13:59:55,919.919 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.value.bias loaded from bert.decoder.layer.2.attention.self.value.bias of shape (768,) 2022-03-17 13:59:55,919.919 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.value.weight loaded from bert.decoder.layer.2.attention.self.value.weight of shape (768, 768) 2022-03-17 13:59:55,919.919 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.intermediate.dense.bias loaded from bert.decoder.layer.2.intermediate.dense.bias of shape (3072,) 2022-03-17 13:59:55,919.919 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.intermediate.dense.weight loaded from bert.decoder.layer.2.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:59:55,919.919 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.LayerNorm.bias loaded from bert.decoder.layer.2.output.LayerNorm.bias of shape (768,) 2022-03-17 13:59:55,919.919 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.LayerNorm.weight loaded from bert.decoder.layer.2.output.LayerNorm.weight of shape (768,) 2022-03-17 13:59:55,919.919 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.dense.bias loaded from bert.decoder.layer.2.output.dense.bias of shape (768,) 2022-03-17 13:59:55,920.920 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.dense.weight loaded from bert.decoder.layer.2.output.dense.weight of shape (768, 3072) 2022-03-17 13:59:55,920.920 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.LayerNorm.bias loaded from bert.decoder.layer.3.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:59:55,920.920 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.LayerNorm.weight loaded from bert.decoder.layer.3.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:59:55,920.920 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.dense.bias loaded from bert.decoder.layer.3.attention.output.dense.bias of shape (768,) 2022-03-17 13:59:55,920.920 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.decoder.layer.3.attention.output.dense.weight loaded from bert.decoder.layer.3.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:59:55,920.920 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.key.bias loaded from bert.decoder.layer.3.attention.self.key.bias of shape (768,) 2022-03-17 13:59:55,920.920 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.key.weight loaded from bert.decoder.layer.3.attention.self.key.weight of shape (768, 768) 2022-03-17 13:59:55,920.920 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.query.bias loaded from bert.decoder.layer.3.attention.self.query.bias of shape (768,) 2022-03-17 13:59:55,920.920 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.query.weight loaded from bert.decoder.layer.3.attention.self.query.weight of shape (768, 768) 2022-03-17 13:59:55,920.920 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.value.bias loaded from bert.decoder.layer.3.attention.self.value.bias of shape (768,) 2022-03-17 13:59:55,920.920 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.value.weight loaded from bert.decoder.layer.3.attention.self.value.weight of shape (768, 768) 2022-03-17 13:59:55,920.920 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.intermediate.dense.bias loaded from bert.decoder.layer.3.intermediate.dense.bias of shape (3072,) 2022-03-17 13:59:55,920.920 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.intermediate.dense.weight loaded from bert.decoder.layer.3.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:59:55,921.921 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.LayerNorm.bias loaded from bert.decoder.layer.3.output.LayerNorm.bias of shape (768,) 2022-03-17 13:59:55,921.921 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.LayerNorm.weight loaded from bert.decoder.layer.3.output.LayerNorm.weight of shape (768,) 2022-03-17 13:59:55,921.921 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.dense.bias loaded from bert.decoder.layer.3.output.dense.bias of shape (768,) 2022-03-17 13:59:55,921.921 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.dense.weight loaded from bert.decoder.layer.3.output.dense.weight of shape (768, 3072) 2022-03-17 13:59:55,921.921 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.LayerNorm.bias loaded from bert.embeddings.LayerNorm.bias of shape (768,) 2022-03-17 13:59:55,921.921 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.LayerNorm.weight loaded from bert.embeddings.LayerNorm.weight of shape (768,) 2022-03-17 13:59:55,921.921 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.position_embeddings.weight loaded from bert.embeddings.position_embeddings.weight of shape (512, 768) 2022-03-17 13:59:55,921.921 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.token_type_embeddings.weight loaded from bert.embeddings.token_type_embeddings.weight of shape (2, 768) 2022-03-17 13:59:55,921.921 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.embeddings.word_embeddings.weight loaded from bert.embeddings.word_embeddings.weight of shape (30522, 768) 2022-03-17 13:59:55,921.921 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.proj.bias loaded from bert.encoder.blocks.0.attn.proj.bias of shape (768,) 2022-03-17 13:59:55,921.921 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.proj.weight loaded from bert.encoder.blocks.0.attn.proj.weight of shape (768, 768) 2022-03-17 13:59:55,921.921 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.qkv.bias loaded from bert.encoder.blocks.0.attn.qkv.bias of shape (2304,) 2022-03-17 13:59:55,921.921 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.qkv.weight loaded from bert.encoder.blocks.0.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:59:55,922.922 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc1.bias loaded from bert.encoder.blocks.0.mlp.fc1.bias of shape (3072,) 2022-03-17 13:59:55,922.922 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc1.weight loaded from bert.encoder.blocks.0.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:59:55,922.922 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc2.bias loaded from bert.encoder.blocks.0.mlp.fc2.bias of shape (768,) 2022-03-17 13:59:55,922.922 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc2.weight loaded from bert.encoder.blocks.0.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:59:55,922.922 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm1.bias loaded from bert.encoder.blocks.0.norm1.bias of shape (768,) 2022-03-17 13:59:55,922.922 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm1.weight loaded from bert.encoder.blocks.0.norm1.weight of shape (768,) 2022-03-17 13:59:55,922.922 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm2.bias loaded from bert.encoder.blocks.0.norm2.bias of shape (768,) 2022-03-17 13:59:55,922.922 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm2.weight loaded from bert.encoder.blocks.0.norm2.weight of shape (768,) 2022-03-17 13:59:55,922.922 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.proj.bias loaded from bert.encoder.blocks.1.attn.proj.bias of shape (768,) 2022-03-17 13:59:55,922.922 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.proj.weight loaded from bert.encoder.blocks.1.attn.proj.weight of shape (768, 768) 2022-03-17 13:59:55,922.922 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.qkv.bias loaded from bert.encoder.blocks.1.attn.qkv.bias of shape (2304,) 2022-03-17 13:59:55,922.922 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.qkv.weight loaded from bert.encoder.blocks.1.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:59:55,922.922 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc1.bias loaded from bert.encoder.blocks.1.mlp.fc1.bias of shape (3072,) 2022-03-17 13:59:55,923.923 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc1.weight loaded from bert.encoder.blocks.1.mlp.fc1.weight of shape 
(3072, 768) 2022-03-17 13:59:55,923.923 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc2.bias loaded from bert.encoder.blocks.1.mlp.fc2.bias of shape (768,) 2022-03-17 13:59:55,923.923 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc2.weight loaded from bert.encoder.blocks.1.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:59:55,923.923 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm1.bias loaded from bert.encoder.blocks.1.norm1.bias of shape (768,) 2022-03-17 13:59:55,923.923 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm1.weight loaded from bert.encoder.blocks.1.norm1.weight of shape (768,) 2022-03-17 13:59:55,923.923 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm2.bias loaded from bert.encoder.blocks.1.norm2.bias of shape (768,) 2022-03-17 13:59:55,923.923 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm2.weight loaded from bert.encoder.blocks.1.norm2.weight of shape (768,) 2022-03-17 13:59:55,923.923 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.proj.bias loaded from bert.encoder.blocks.10.attn.proj.bias of shape (768,) 2022-03-17 13:59:55,923.923 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.proj.weight loaded from bert.encoder.blocks.10.attn.proj.weight of shape (768, 768) 2022-03-17 13:59:55,923.923 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.qkv.bias loaded from bert.encoder.blocks.10.attn.qkv.bias of shape (2304,) 2022-03-17 13:59:55,923.923 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.qkv.weight loaded from bert.encoder.blocks.10.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:59:55,923.923 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc1.bias loaded from bert.encoder.blocks.10.mlp.fc1.bias of shape (3072,) 2022-03-17 13:59:55,923.923 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc1.weight loaded from bert.encoder.blocks.10.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:59:55,923.923 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc2.bias loaded from bert.encoder.blocks.10.mlp.fc2.bias of shape (768,) 2022-03-17 13:59:55,924.924 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc2.weight loaded from bert.encoder.blocks.10.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:59:55,924.924 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm1.bias loaded from bert.encoder.blocks.10.norm1.bias of shape (768,) 2022-03-17 13:59:55,924.924 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm1.weight loaded from bert.encoder.blocks.10.norm1.weight of shape (768,) 2022-03-17 13:59:55,924.924 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm2.bias loaded from bert.encoder.blocks.10.norm2.bias of shape (768,) 2022-03-17 13:59:55,924.924 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm2.weight loaded from bert.encoder.blocks.10.norm2.weight of shape (768,) 2022-03-17 13:59:55,924.924 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.encoder.blocks.11.attn.proj.bias loaded from bert.encoder.blocks.11.attn.proj.bias of shape (768,) 2022-03-17 13:59:55,924.924 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.proj.weight loaded from bert.encoder.blocks.11.attn.proj.weight of shape (768, 768) 2022-03-17 13:59:55,924.924 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.qkv.bias loaded from bert.encoder.blocks.11.attn.qkv.bias of shape (2304,) 2022-03-17 13:59:55,924.924 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.qkv.weight loaded from bert.encoder.blocks.11.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:59:55,924.924 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc1.bias loaded from bert.encoder.blocks.11.mlp.fc1.bias of shape (3072,) 2022-03-17 13:59:55,924.924 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc1.weight loaded from bert.encoder.blocks.11.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:59:55,924.924 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc2.bias loaded from bert.encoder.blocks.11.mlp.fc2.bias of shape (768,) 2022-03-17 13:59:55,924.924 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc2.weight loaded from bert.encoder.blocks.11.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:59:55,925.925 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm1.bias loaded from bert.encoder.blocks.11.norm1.bias of shape (768,) 2022-03-17 13:59:55,925.925 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm1.weight loaded from bert.encoder.blocks.11.norm1.weight of shape (768,) 2022-03-17 13:59:55,925.925 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm2.bias loaded from bert.encoder.blocks.11.norm2.bias of shape (768,) 2022-03-17 13:59:55,925.925 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm2.weight loaded from bert.encoder.blocks.11.norm2.weight of shape (768,) 2022-03-17 13:59:55,925.925 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.proj.bias loaded from bert.encoder.blocks.2.attn.proj.bias of shape (768,) 2022-03-17 13:59:55,925.925 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.proj.weight loaded from bert.encoder.blocks.2.attn.proj.weight of shape (768, 768) 2022-03-17 13:59:55,925.925 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.qkv.bias loaded from bert.encoder.blocks.2.attn.qkv.bias of shape (2304,) 2022-03-17 13:59:55,925.925 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.qkv.weight loaded from bert.encoder.blocks.2.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:59:55,925.925 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc1.bias loaded from bert.encoder.blocks.2.mlp.fc1.bias of shape (3072,) 2022-03-17 13:59:55,925.925 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc1.weight loaded from bert.encoder.blocks.2.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:59:55,925.925 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc2.bias loaded from 
bert.encoder.blocks.2.mlp.fc2.bias of shape (768,) 2022-03-17 13:59:55,925.925 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc2.weight loaded from bert.encoder.blocks.2.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:59:55,925.925 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm1.bias loaded from bert.encoder.blocks.2.norm1.bias of shape (768,) 2022-03-17 13:59:55,926.926 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm1.weight loaded from bert.encoder.blocks.2.norm1.weight of shape (768,) 2022-03-17 13:59:55,926.926 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm2.bias loaded from bert.encoder.blocks.2.norm2.bias of shape (768,) 2022-03-17 13:59:55,926.926 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm2.weight loaded from bert.encoder.blocks.2.norm2.weight of shape (768,) 2022-03-17 13:59:55,926.926 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.proj.bias loaded from bert.encoder.blocks.3.attn.proj.bias of shape (768,) 2022-03-17 13:59:55,926.926 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.proj.weight loaded from bert.encoder.blocks.3.attn.proj.weight of shape (768, 768) 2022-03-17 13:59:55,926.926 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.qkv.bias loaded from bert.encoder.blocks.3.attn.qkv.bias of shape (2304,) 2022-03-17 13:59:55,926.926 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.qkv.weight loaded from bert.encoder.blocks.3.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:59:55,926.926 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc1.bias loaded from bert.encoder.blocks.3.mlp.fc1.bias of shape (3072,) 2022-03-17 13:59:55,926.926 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc1.weight loaded from bert.encoder.blocks.3.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:59:55,926.926 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc2.bias loaded from bert.encoder.blocks.3.mlp.fc2.bias of shape (768,) 2022-03-17 13:59:55,926.926 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc2.weight loaded from bert.encoder.blocks.3.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:59:55,926.926 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm1.bias loaded from bert.encoder.blocks.3.norm1.bias of shape (768,) 2022-03-17 13:59:55,926.926 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm1.weight loaded from bert.encoder.blocks.3.norm1.weight of shape (768,) 2022-03-17 13:59:55,926.926 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm2.bias loaded from bert.encoder.blocks.3.norm2.bias of shape (768,) 2022-03-17 13:59:55,927.927 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm2.weight loaded from bert.encoder.blocks.3.norm2.weight of shape (768,) 2022-03-17 13:59:55,927.927 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.attn.proj.bias loaded from bert.encoder.blocks.4.attn.proj.bias of shape (768,) 2022-03-17 13:59:55,927.927 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.encoder.blocks.4.attn.proj.weight loaded from bert.encoder.blocks.4.attn.proj.weight of shape (768, 768)
2022-03-17 13:59:55,935.935 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.qkv.weight loaded from bert.encoder.tag_blocks.3.attn.qkv.weight of shape (2304, 768)
2022-03-17 13:59:55,935.935 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc1.bias loaded from bert.encoder.tag_blocks.3.mlp.fc1.bias of shape (3072,)
2022-03-17 13:59:55,935.935 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc1.weight loaded from bert.encoder.tag_blocks.3.mlp.fc1.weight of shape (3072, 768)
2022-03-17 13:59:55,935.935 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc2.bias loaded from bert.encoder.tag_blocks.3.mlp.fc2.bias of shape (768,)
2022-03-17 13:59:55,935.935 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc2.weight loaded from bert.encoder.tag_blocks.3.mlp.fc2.weight of shape (768, 3072)
2022-03-17 13:59:55,935.935 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm1.bias loaded from bert.encoder.tag_blocks.3.norm1.bias of shape (768,)
2022-03-17 13:59:55,935.935 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm1.weight loaded from bert.encoder.tag_blocks.3.norm1.weight of shape (768,)
2022-03-17 13:59:55,935.935 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm2.bias loaded from bert.encoder.tag_blocks.3.norm2.bias of shape (768,)
2022-03-17 13:59:55,935.935 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm2.weight loaded from bert.encoder.tag_blocks.3.norm2.weight of shape (768,)
2022-03-17 13:59:55,936.936 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.LayerNorm.bias loaded from bert.extra_embeddings.LayerNorm.bias of shape (768,)
2022-03-17 13:59:55,936.936 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.LayerNorm.weight loaded from bert.extra_embeddings.LayerNorm.weight of shape (768,)
2022-03-17 13:59:55,936.936 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.position_embeddings.weight loaded from bert.extra_embeddings.position_embeddings.weight of shape (512, 768)
2022-03-17 13:59:55,936.936 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.token_type_embeddings.weight loaded from bert.extra_embeddings.token_type_embeddings.weight of shape (2, 768)
2022-03-17 13:59:55,936.936 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.word_embeddings.weight loaded from bert.extra_embeddings.word_embeddings.weight of shape (30522, 768)
2022-03-17 13:59:55,936.936 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.pooler.dense.bias loaded from bert.pooler.dense.bias of shape (768,)
2022-03-17 13:59:55,936.936 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.pooler.dense.weight loaded from bert.pooler.dense.weight of shape (768, 768)
2022-03-17 13:59:55,936.936 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.bias loaded from bert.tag_logit.predictions.bias of shape (30522,)
2022-03-17 13:59:55,936.936 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.decoder.weight loaded from bert.tag_logit.predictions.decoder.weight of shape (30522, 768)
2022-03-17 13:59:55,936.936 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.LayerNorm.bias loaded from bert.tag_logit.predictions.transform.LayerNorm.bias of shape (768,)
2022-03-17 13:59:55,936.936 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.LayerNorm.weight loaded from bert.tag_logit.predictions.transform.LayerNorm.weight of shape (768,)
2022-03-17 13:59:55,936.936 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.dense.bias loaded from bert.tag_logit.predictions.transform.dense.bias of shape (768,)
2022-03-17 13:59:55,936.936 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.dense.weight loaded from bert.tag_logit.predictions.transform.dense.weight of shape (768, 768)
2022-03-17 13:59:55,937.937 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.bias loaded from cls.predictions.bias of shape (30522,)
2022-03-17 13:59:55,937.937 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.decoder.weight loaded from cls.predictions.decoder.weight of shape (30522, 768)
2022-03-17 13:59:55,937.937 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.LayerNorm.bias loaded from cls.predictions.transform.LayerNorm.bias of shape (768,)
2022-03-17 13:59:55,937.937 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.LayerNorm.weight loaded from cls.predictions.transform.LayerNorm.weight of shape (768,)
2022-03-17 13:59:55,937.937 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.dense.bias loaded from cls.predictions.transform.dense.bias of shape (768,)
2022-03-17 13:59:55,937.937 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.dense.weight loaded from cls.predictions.transform.dense.weight of shape (768, 768)
2022-03-17 13:59:55,937.937 2829:checkpoint.py:104 align_and_update_state_dicts(): target model param = 288; name matched = 288; loaded = 288
2022-03-17 13:59:55,937.937 2829:checkpoint.py:107 align_and_update_state_dicts(): from loaded; ignore = []
2022-03-17 13:59:55,940.940 2829:torch_common.py:1069 load_model_state_ignore_mismatch(): unique keys in init dict = ['module.cls.predictions.decoder.weight']; total = 1
2022-03-17 13:59:56,100.100 2829:torch_common.py:1074 load_model_state_ignore_mismatch(): unique key (not initialized) in current model = ['module.cls.predictions.decoder.weight']
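The matched/loaded summary is the useful signal in this dump: every checkpoint key pairs with a model key once the DataParallel-style "module." prefix is accounted for, and a parameter only counts as loaded when the shapes agree (the fused qkv projections are (2304, 768) because 2304 = 3 x 768 for query, key, and value). Below is a minimal sketch of that kind of name-and-shape alignment, assuming PyTorch; align_state_dict is a hypothetical helper, not the repo's actual align_and_update_state_dicts implementation.

import torch

def align_state_dict(model_sd, ckpt_sd, prefix="module."):
    # Pair each model key with the checkpoint key obtained by stripping
    # the DataParallel-style prefix; accept only exact shape matches.
    aligned = {}
    for model_key, tensor in model_sd.items():
        ckpt_key = model_key[len(prefix):] if model_key.startswith(prefix) else model_key
        if ckpt_key in ckpt_sd and ckpt_sd[ckpt_key].shape == tensor.shape:
            aligned[model_key] = ckpt_sd[ckpt_key]
    return aligned

# Toy check with the fused-qkv shape from the records: 2304 = 3 * 768.
model_sd = {"module.bert.encoder.blocks.4.attn.qkv.weight": torch.empty(2304, 768)}
ckpt_sd = {"bert.encoder.blocks.4.attn.qkv.weight": torch.empty(2304, 768)}
assert len(align_state_dict(model_sd, ckpt_sd)) == 1  # name matched = 1; loaded = 1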
2022-03-17 13:59:56,435.435 2829:qd_common.py:3452 print_frame_info(): func name = __init__; self = ; data = TaxCocoCaption; split = test; add_key = False; backend = cv; hold_buffer = 0; save_original = False
2022-03-17 13:59:56,463.463 2829:uni_pipeline.py:509 get_data_loader(): sampler = 
2022-03-17 13:59:56,464.464 2829:uni_pipeline.py:1000 predict(): writing output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0055000.pt.TaxCocoCaption.test.crop384.predict.tsv_0_32.tsv
2022-03-17 13:59:56,540.540 2829:uni_pipeline.py:907 predict_iter(): DatasetPlusTransform(dataset=, transform=Compose(
    Compose(
        ImageTransform2Dict(image_transform=Compose(
            ToPILImage()
            Resize(size=384, interpolation=PIL.Image.BICUBIC)
            CenterCrop(size=(384, 384))
            ToTensor()
            Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
        ))
    )
    LoadLabel(data=TaxCocoCaption, split=test, version=vinvl)
    TransCaptionTensorizer(tensorizer=, pad_to_max=True, pad_image_to_max=True)
))
uni_pipeline.py:908: 0%| | 0/10 [00:00
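The image half of the Compose printed in the predict_iter() record above is plain torchvision; a standalone equivalent might look like the sketch below. The DatasetPlusTransform, LoadLabel, and TransCaptionTensorizer wrappers are project-specific and omitted, and the PIL.Image.BICUBIC constant matches the 2022-era torchvision in this environment (newer releases prefer transforms.InterpolationMode.BICUBIC).

from PIL import Image
from torchvision import transforms

# Evaluation-time transform as printed in the record: numpy array from the
# cv backend -> PIL -> bicubic resize to 384 -> 384x384 center crop ->
# tensor in [0, 1] -> normalize to [-1, 1].
eval_transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.Resize(384, interpolation=Image.BICUBIC),
    transforms.CenterCrop((384, 384)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),
])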
File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 989, in predict model = self.get_model(is_train=False) File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model model = self.get_raw_model(is_train) File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model model = TaggerEncDecSplitForImageCaptioning(config=config) # init from scratch File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__ self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config) File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__ self.encoder = TIMMVitSplitEncoder(config) File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__ logging.info(config) Unable to print the message and arguments - possible formatting error. Use the traceback above to help find the error. 2022-03-17 14:02:11,526.526 2829:modeling_bert.py:529 __init__(): TIMM Split image encoder load from pre-trained: True 2022-03-17 14:02:12,599.599 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py 2022-03-17 14:02:14,102.102 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py 2022-03-17 14:02:14,728.728 2829:modeling_bert.py:2677 __init__(): BertImgModel Image Dimension: 2054 2022-03-17 14:02:14,876.876 2829:modeling_bert.py:2688 __init__(): BertImgModel Image Dimension: 2054 2022-03-17 14:02:15,646.646 2829:tagger_caption_uni_pipeline_expanding.py:1130 get_image_encoder_model(): VIT image encoder loaded from pre-trained weight! Note that this might be replaced by pre-trained checkpoint later! 2022-03-17 14:02:15,646.646 2829:tagger_caption_uni_pipeline_expanding.py:1134 get_image_encoder_model(): Non-Patch Selection Mode. 
2022-03-17 14:02:16,736.736 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-17 14:02:17,294.294 2829:checkpoint.py:240 load(): Loading checkpoint from output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0060000.pt
2022-03-17 14:02:23,632.632 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.cls_token loaded from image_encoder.module.cls_token of shape (1, 1, 768)
2022-03-17 14:02:23,632.632 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.head.bias loaded from image_encoder.module.head.bias of shape (1000,)
2022-03-17 14:02:23,632.632 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.head.weight loaded from image_encoder.module.head.weight of shape (1000, 768)
2022-03-17 14:02:23,632.632 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.patch_embed.proj.bias loaded from image_encoder.module.patch_embed.proj.bias of shape (768,)
2022-03-17 14:02:23,632.632 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.patch_embed.proj.weight loaded from image_encoder.module.patch_embed.proj.weight of shape (768, 3, 16, 16)
2022-03-17 14:02:23,632.632 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.pos_embed loaded from image_encoder.module.pos_embed of shape (1, 577, 768)
2022-03-17 14:02:23,633.633 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.caption_pooler.dense.bias loaded from bert.caption_pooler.dense.bias of shape (768,)
2022-03-17 14:02:23,633.633 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.caption_pooler.dense.weight loaded from bert.caption_pooler.dense.weight of shape (768, 768)
2022-03-17 14:02:23,633.633 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.LayerNorm.bias loaded from bert.decoder.layer.0.attention.output.LayerNorm.bias of shape (768,)
2022-03-17 14:02:23,633.633 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.LayerNorm.weight loaded from bert.decoder.layer.0.attention.output.LayerNorm.weight of shape (768,)
2022-03-17 14:02:23,633.633 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.dense.bias loaded from bert.decoder.layer.0.attention.output.dense.bias of shape (768,)
2022-03-17 14:02:23,633.633 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.dense.weight loaded from bert.decoder.layer.0.attention.output.dense.weight of shape (768, 768)
2022-03-17 14:02:23,633.633 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.key.bias loaded from bert.decoder.layer.0.attention.self.key.bias of shape (768,)
2022-03-17 14:02:23,633.633 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.key.weight loaded from bert.decoder.layer.0.attention.self.key.weight of shape (768, 768)
2022-03-17 14:02:23,633.633 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.query.bias loaded from bert.decoder.layer.0.attention.self.query.bias of shape (768,)
2022-03-17 14:02:23,633.633 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.query.weight loaded from bert.decoder.layer.0.attention.self.query.weight of shape (768, 768)
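The pos_embed shape (1, 577, 768) a few records above is worth a quick sanity check: at 384x384 input with 16x16 patches, ViT-B/16-384 has (384 / 16)^2 = 576 patch tokens plus one class token, i.e. 577 positions. The same jx_vit_base_p16_384 weights the log downloads resolve through timm, so (assuming timm is installed) the shape can be cross-checked directly:

import timm  # assumes timm is available in the environment

num_patches = (384 // 16) ** 2      # 576 patch tokens for a 384x384 input
assert num_patches + 1 == 577       # plus the [CLS] token -> 577 positions

vit = timm.create_model("vit_base_patch16_384", pretrained=True)
print(vit.pos_embed.shape)          # torch.Size([1, 577, 768])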
2022-03-17 14:02:23,645.645 2829:checkpoint.py:99 align_and_update_state_dicts():
module.bert.encoder.blocks.6.attn.qkv.bias loaded from bert.encoder.blocks.6.attn.qkv.bias of shape (2304,) 2022-03-17 14:02:23,645.645 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.attn.qkv.weight loaded from bert.encoder.blocks.6.attn.qkv.weight of shape (2304, 768) 2022-03-17 14:02:23,645.645 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc1.bias loaded from bert.encoder.blocks.6.mlp.fc1.bias of shape (3072,) 2022-03-17 14:02:23,645.645 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc1.weight loaded from bert.encoder.blocks.6.mlp.fc1.weight of shape (3072, 768) 2022-03-17 14:02:23,645.645 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc2.bias loaded from bert.encoder.blocks.6.mlp.fc2.bias of shape (768,) 2022-03-17 14:02:23,645.645 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc2.weight loaded from bert.encoder.blocks.6.mlp.fc2.weight of shape (768, 3072) 2022-03-17 14:02:23,645.645 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm1.bias loaded from bert.encoder.blocks.6.norm1.bias of shape (768,) 2022-03-17 14:02:23,645.645 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm1.weight loaded from bert.encoder.blocks.6.norm1.weight of shape (768,) 2022-03-17 14:02:23,645.645 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm2.bias loaded from bert.encoder.blocks.6.norm2.bias of shape (768,) 2022-03-17 14:02:23,646.646 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm2.weight loaded from bert.encoder.blocks.6.norm2.weight of shape (768,) 2022-03-17 14:02:23,646.646 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.proj.bias loaded from bert.encoder.blocks.7.attn.proj.bias of shape (768,) 2022-03-17 14:02:23,646.646 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.proj.weight loaded from bert.encoder.blocks.7.attn.proj.weight of shape (768, 768) 2022-03-17 14:02:23,646.646 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.qkv.bias loaded from bert.encoder.blocks.7.attn.qkv.bias of shape (2304,) 2022-03-17 14:02:23,646.646 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.qkv.weight loaded from bert.encoder.blocks.7.attn.qkv.weight of shape (2304, 768) 2022-03-17 14:02:23,646.646 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc1.bias loaded from bert.encoder.blocks.7.mlp.fc1.bias of shape (3072,) 2022-03-17 14:02:23,646.646 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc1.weight loaded from bert.encoder.blocks.7.mlp.fc1.weight of shape (3072, 768) 2022-03-17 14:02:23,646.646 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc2.bias loaded from bert.encoder.blocks.7.mlp.fc2.bias of shape (768,) 2022-03-17 14:02:23,646.646 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc2.weight loaded from bert.encoder.blocks.7.mlp.fc2.weight of shape (768, 3072) 2022-03-17 14:02:23,646.646 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm1.bias loaded from bert.encoder.blocks.7.norm1.bias of shape (768,) 2022-03-17 
14:02:23,646.646 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm1.weight loaded from bert.encoder.blocks.7.norm1.weight of shape (768,) 2022-03-17 14:02:23,646.646 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm2.bias loaded from bert.encoder.blocks.7.norm2.bias of shape (768,) 2022-03-17 14:02:23,646.646 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm2.weight loaded from bert.encoder.blocks.7.norm2.weight of shape (768,) 2022-03-17 14:02:23,647.647 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.proj.bias loaded from bert.encoder.blocks.8.attn.proj.bias of shape (768,) 2022-03-17 14:02:23,647.647 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.proj.weight loaded from bert.encoder.blocks.8.attn.proj.weight of shape (768, 768) 2022-03-17 14:02:23,647.647 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.qkv.bias loaded from bert.encoder.blocks.8.attn.qkv.bias of shape (2304,) 2022-03-17 14:02:23,647.647 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.qkv.weight loaded from bert.encoder.blocks.8.attn.qkv.weight of shape (2304, 768) 2022-03-17 14:02:23,647.647 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc1.bias loaded from bert.encoder.blocks.8.mlp.fc1.bias of shape (3072,) 2022-03-17 14:02:23,647.647 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc1.weight loaded from bert.encoder.blocks.8.mlp.fc1.weight of shape (3072, 768) 2022-03-17 14:02:23,647.647 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc2.bias loaded from bert.encoder.blocks.8.mlp.fc2.bias of shape (768,) 2022-03-17 14:02:23,647.647 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc2.weight loaded from bert.encoder.blocks.8.mlp.fc2.weight of shape (768, 3072) 2022-03-17 14:02:23,647.647 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm1.bias loaded from bert.encoder.blocks.8.norm1.bias of shape (768,) 2022-03-17 14:02:23,647.647 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm1.weight loaded from bert.encoder.blocks.8.norm1.weight of shape (768,) 2022-03-17 14:02:23,647.647 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm2.bias loaded from bert.encoder.blocks.8.norm2.bias of shape (768,) 2022-03-17 14:02:23,647.647 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm2.weight loaded from bert.encoder.blocks.8.norm2.weight of shape (768,) 2022-03-17 14:02:23,647.647 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.proj.bias loaded from bert.encoder.blocks.9.attn.proj.bias of shape (768,) 2022-03-17 14:02:23,647.647 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.proj.weight loaded from bert.encoder.blocks.9.attn.proj.weight of shape (768, 768) 2022-03-17 14:02:23,647.647 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.qkv.bias loaded from bert.encoder.blocks.9.attn.qkv.bias of shape (2304,) 2022-03-17 14:02:23,648.648 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.qkv.weight loaded from 
bert.encoder.blocks.9.attn.qkv.weight of shape (2304, 768) 2022-03-17 14:02:23,648.648 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc1.bias loaded from bert.encoder.blocks.9.mlp.fc1.bias of shape (3072,) 2022-03-17 14:02:23,648.648 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc1.weight loaded from bert.encoder.blocks.9.mlp.fc1.weight of shape (3072, 768) 2022-03-17 14:02:23,648.648 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc2.bias loaded from bert.encoder.blocks.9.mlp.fc2.bias of shape (768,) 2022-03-17 14:02:23,648.648 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc2.weight loaded from bert.encoder.blocks.9.mlp.fc2.weight of shape (768, 3072) 2022-03-17 14:02:23,648.648 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm1.bias loaded from bert.encoder.blocks.9.norm1.bias of shape (768,) 2022-03-17 14:02:23,648.648 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm1.weight loaded from bert.encoder.blocks.9.norm1.weight of shape (768,) 2022-03-17 14:02:23,648.648 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm2.bias loaded from bert.encoder.blocks.9.norm2.bias of shape (768,) 2022-03-17 14:02:23,648.648 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm2.weight loaded from bert.encoder.blocks.9.norm2.weight of shape (768,) 2022-03-17 14:02:23,648.648 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.proj.bias loaded from bert.encoder.tag_blocks.0.attn.proj.bias of shape (768,) 2022-03-17 14:02:23,648.648 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.proj.weight loaded from bert.encoder.tag_blocks.0.attn.proj.weight of shape (768, 768) 2022-03-17 14:02:23,648.648 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.qkv.bias loaded from bert.encoder.tag_blocks.0.attn.qkv.bias of shape (2304,) 2022-03-17 14:02:23,648.648 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.qkv.weight loaded from bert.encoder.tag_blocks.0.attn.qkv.weight of shape (2304, 768) 2022-03-17 14:02:23,649.649 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc1.bias loaded from bert.encoder.tag_blocks.0.mlp.fc1.bias of shape (3072,) 2022-03-17 14:02:23,649.649 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc1.weight loaded from bert.encoder.tag_blocks.0.mlp.fc1.weight of shape (3072, 768) 2022-03-17 14:02:23,649.649 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc2.bias loaded from bert.encoder.tag_blocks.0.mlp.fc2.bias of shape (768,) 2022-03-17 14:02:23,649.649 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc2.weight loaded from bert.encoder.tag_blocks.0.mlp.fc2.weight of shape (768, 3072) 2022-03-17 14:02:23,649.649 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm1.bias loaded from bert.encoder.tag_blocks.0.norm1.bias of shape (768,) 2022-03-17 14:02:23,649.649 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm1.weight loaded from bert.encoder.tag_blocks.0.norm1.weight of shape 
(768,) 2022-03-17 14:02:23,649.649 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm2.bias loaded from bert.encoder.tag_blocks.0.norm2.bias of shape (768,) 2022-03-17 14:02:23,649.649 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm2.weight loaded from bert.encoder.tag_blocks.0.norm2.weight of shape (768,) 2022-03-17 14:02:23,649.649 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.proj.bias loaded from bert.encoder.tag_blocks.1.attn.proj.bias of shape (768,) 2022-03-17 14:02:23,649.649 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.proj.weight loaded from bert.encoder.tag_blocks.1.attn.proj.weight of shape (768, 768) 2022-03-17 14:02:23,649.649 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.qkv.bias loaded from bert.encoder.tag_blocks.1.attn.qkv.bias of shape (2304,) 2022-03-17 14:02:23,649.649 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.qkv.weight loaded from bert.encoder.tag_blocks.1.attn.qkv.weight of shape (2304, 768) 2022-03-17 14:02:23,649.649 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc1.bias loaded from bert.encoder.tag_blocks.1.mlp.fc1.bias of shape (3072,) 2022-03-17 14:02:23,649.649 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc1.weight loaded from bert.encoder.tag_blocks.1.mlp.fc1.weight of shape (3072, 768) 2022-03-17 14:02:23,649.649 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc2.bias loaded from bert.encoder.tag_blocks.1.mlp.fc2.bias of shape (768,) 2022-03-17 14:02:23,650.650 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc2.weight loaded from bert.encoder.tag_blocks.1.mlp.fc2.weight of shape (768, 3072) 2022-03-17 14:02:23,650.650 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm1.bias loaded from bert.encoder.tag_blocks.1.norm1.bias of shape (768,) 2022-03-17 14:02:23,650.650 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm1.weight loaded from bert.encoder.tag_blocks.1.norm1.weight of shape (768,) 2022-03-17 14:02:23,650.650 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm2.bias loaded from bert.encoder.tag_blocks.1.norm2.bias of shape (768,) 2022-03-17 14:02:23,650.650 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm2.weight loaded from bert.encoder.tag_blocks.1.norm2.weight of shape (768,) 2022-03-17 14:02:23,650.650 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.proj.bias loaded from bert.encoder.tag_blocks.2.attn.proj.bias of shape (768,) 2022-03-17 14:02:23,650.650 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.proj.weight loaded from bert.encoder.tag_blocks.2.attn.proj.weight of shape (768, 768) 2022-03-17 14:02:23,650.650 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.qkv.bias loaded from bert.encoder.tag_blocks.2.attn.qkv.bias of shape (2304,) 2022-03-17 14:02:23,650.650 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.qkv.weight loaded from 
bert.encoder.tag_blocks.2.attn.qkv.weight of shape (2304, 768) 2022-03-17 14:02:23,650.650 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc1.bias loaded from bert.encoder.tag_blocks.2.mlp.fc1.bias of shape (3072,) 2022-03-17 14:02:23,650.650 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc1.weight loaded from bert.encoder.tag_blocks.2.mlp.fc1.weight of shape (3072, 768) 2022-03-17 14:02:23,650.650 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc2.bias loaded from bert.encoder.tag_blocks.2.mlp.fc2.bias of shape (768,) 2022-03-17 14:02:23,650.650 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc2.weight loaded from bert.encoder.tag_blocks.2.mlp.fc2.weight of shape (768, 3072) 2022-03-17 14:02:23,650.650 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm1.bias loaded from bert.encoder.tag_blocks.2.norm1.bias of shape (768,) 2022-03-17 14:02:23,651.651 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm1.weight loaded from bert.encoder.tag_blocks.2.norm1.weight of shape (768,) 2022-03-17 14:02:23,651.651 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm2.bias loaded from bert.encoder.tag_blocks.2.norm2.bias of shape (768,) 2022-03-17 14:02:23,651.651 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm2.weight loaded from bert.encoder.tag_blocks.2.norm2.weight of shape (768,) 2022-03-17 14:02:23,651.651 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.proj.bias loaded from bert.encoder.tag_blocks.3.attn.proj.bias of shape (768,) 2022-03-17 14:02:23,651.651 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.proj.weight loaded from bert.encoder.tag_blocks.3.attn.proj.weight of shape (768, 768) 2022-03-17 14:02:23,651.651 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.qkv.bias loaded from bert.encoder.tag_blocks.3.attn.qkv.bias of shape (2304,) 2022-03-17 14:02:23,651.651 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.qkv.weight loaded from bert.encoder.tag_blocks.3.attn.qkv.weight of shape (2304, 768) 2022-03-17 14:02:23,651.651 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc1.bias loaded from bert.encoder.tag_blocks.3.mlp.fc1.bias of shape (3072,) 2022-03-17 14:02:23,651.651 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc1.weight loaded from bert.encoder.tag_blocks.3.mlp.fc1.weight of shape (3072, 768) 2022-03-17 14:02:23,651.651 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc2.bias loaded from bert.encoder.tag_blocks.3.mlp.fc2.bias of shape (768,) 2022-03-17 14:02:23,651.651 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc2.weight loaded from bert.encoder.tag_blocks.3.mlp.fc2.weight of shape (768, 3072) 2022-03-17 14:02:23,651.651 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm1.bias loaded from bert.encoder.tag_blocks.3.norm1.bias of shape (768,) 2022-03-17 14:02:23,651.651 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.encoder.tag_blocks.3.norm1.weight loaded from bert.encoder.tag_blocks.3.norm1.weight of shape (768,) 2022-03-17 14:02:23,651.651 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm2.bias loaded from bert.encoder.tag_blocks.3.norm2.bias of shape (768,) 2022-03-17 14:02:23,652.652 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm2.weight loaded from bert.encoder.tag_blocks.3.norm2.weight of shape (768,) 2022-03-17 14:02:23,652.652 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.LayerNorm.bias loaded from bert.extra_embeddings.LayerNorm.bias of shape (768,) 2022-03-17 14:02:23,652.652 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.LayerNorm.weight loaded from bert.extra_embeddings.LayerNorm.weight of shape (768,) 2022-03-17 14:02:23,652.652 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.position_embeddings.weight loaded from bert.extra_embeddings.position_embeddings.weight of shape (512, 768) 2022-03-17 14:02:23,652.652 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.token_type_embeddings.weight loaded from bert.extra_embeddings.token_type_embeddings.weight of shape (2, 768) 2022-03-17 14:02:23,652.652 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.word_embeddings.weight loaded from bert.extra_embeddings.word_embeddings.weight of shape (30522, 768) 2022-03-17 14:02:23,652.652 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.pooler.dense.bias loaded from bert.pooler.dense.bias of shape (768,) 2022-03-17 14:02:23,652.652 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.pooler.dense.weight loaded from bert.pooler.dense.weight of shape (768, 768) 2022-03-17 14:02:23,652.652 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.bias loaded from bert.tag_logit.predictions.bias of shape (30522,) 2022-03-17 14:02:23,652.652 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.decoder.weight loaded from bert.tag_logit.predictions.decoder.weight of shape (30522, 768) 2022-03-17 14:02:23,652.652 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.LayerNorm.bias loaded from bert.tag_logit.predictions.transform.LayerNorm.bias of shape (768,) 2022-03-17 14:02:23,652.652 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.LayerNorm.weight loaded from bert.tag_logit.predictions.transform.LayerNorm.weight of shape (768,) 2022-03-17 14:02:23,652.652 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.dense.bias loaded from bert.tag_logit.predictions.transform.dense.bias of shape (768,) 2022-03-17 14:02:23,652.652 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.dense.weight loaded from bert.tag_logit.predictions.transform.dense.weight of shape (768, 768) 2022-03-17 14:02:23,653.653 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.bias loaded from cls.predictions.bias of shape (30522,) 2022-03-17 14:02:23,653.653 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.decoder.weight loaded from cls.predictions.decoder.weight of shape (30522, 768) 2022-03-17 14:02:23,653.653 2829:checkpoint.py:99 
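(For orientation: the messages above come from a name-based state-dict alignment. Below is a minimal illustrative sketch of what such an alignment does, assuming only a DataParallel-style "module." prefix on the model keys; this is a hypothetical helper, not the repository's actual checkpoint.py.)

# Sketch of name-based state-dict alignment (hypothetical; not the repo's
# checkpoint.py). Assumes model keys carry a "module." prefix that the
# checkpoint keys lack, as the log above suggests.
import logging
import torch

def align_and_load(model: torch.nn.Module, ckpt_state: dict) -> None:
    model_state = model.state_dict()
    matched = {}
    for name, param in model_state.items():
        ckpt_name = name[len("module."):] if name.startswith("module.") else name
        if ckpt_name in ckpt_state and ckpt_state[ckpt_name].shape == param.shape:
            matched[name] = ckpt_state[ckpt_name]
            logging.info("%s loaded from %s of shape %s",
                         name, ckpt_name, tuple(param.shape))
    logging.info("target model param = %d; name matched = %d; loaded = %d",
                 len(model_state), len(matched), len(matched))
    # Unmatched parameters stay at their initialization.
    model.load_state_dict(matched, strict=False)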
2022-03-17 14:02:23,653.653 2829:checkpoint.py:104 align_and_update_state_dicts(): target model param = 288; name matched = 288; loaded = 288
2022-03-17 14:02:23,653.653 2829:checkpoint.py:107 align_and_update_state_dicts(): from loaded; ignore = []
2022-03-17 14:02:23,656.656 2829:torch_common.py:1069 load_model_state_ignore_mismatch(): unique keys in init dict = ['module.cls.predictions.decoder.weight']; total = 1
2022-03-17 14:02:23,828.828 2829:torch_common.py:1074 load_model_state_ignore_mismatch(): unique key (not initialized) in current model = ['module.cls.predictions.decoder.weight']
2022-03-17 14:02:24,163.163 2829:qd_common.py:3452 print_frame_info(): func name = __init__; self = ; data = TaxCocoCaption; split = test; add_key = False; backend = cv; hold_buffer = 0; save_original = False
2022-03-17 14:02:24,189.189 2829:uni_pipeline.py:509 get_data_loader(): sampler =
2022-03-17 14:02:24,189.189 2829:uni_pipeline.py:1000 predict(): writing output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0060000.pt.TaxCocoCaption.test.crop384.predict.tsv_0_32.tsv
2022-03-17 14:02:24,262.262 2829:uni_pipeline.py:907 predict_iter(): DatasetPlusTransform(dataset=, transform=Compose(
    Compose(
        ImageTransform2Dict(image_transform=Compose(
            ToPILImage()
            Resize(size=384, interpolation=PIL.Image.BICUBIC)
            CenterCrop(size=(384, 384))
            ToTensor()
            Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
        ))
    )
    LoadLabel(data=TaxCocoCaption, split=test, version=vinvl)
    TransCaptionTensorizer(tensorizer=, pad_to_max=True, pad_image_to_max=True)
))
uni_pipeline.py:908:   0%|          | 0/10 [00:00<?, ?it/s]
--- Logging error ---
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 1025, in emit
    msg = self.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 869, in format
    return fmt.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 608, in format
    record.message = record.getMessage()
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 367, in getMessage
    msg = str(self.msg)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 236, in __repr__
    return str(self.to_json_string())
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 245, in to_json_string
    return json.dumps(self.to_dict(), indent=2, sort_keys=True) + "\n"
  File "/opt/conda/lib/python3.7/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 201, in encode
    chunks = list(chunks)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 431, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "/opt/conda/lib/python3.7/json/encoder.py", line 438, in _iterencode
    o = _default(o)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type BertTokenizer is not JSON serializable
Call stack:
  File "src/qd/pipeline.py", line 1368, in <module>
    locals()[function_name](**kwargs)
  File "src/qd/pipeline.py", line 637, in pipeline_train_eval_multi
    pip.monitor_train()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1224, in monitor_train
    need_wait_models = self.pred_eval_intermediate_models()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1286, in pred_eval_intermediate_models
    pred = self.ensure_predict(model_file=model_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 886, in ensure_predict
    self.predict(model_file, predict_result_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 989, in predict
    model = self.get_model(is_train=False)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model
    model = self.get_raw_model(is_train)
  File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model
    model = TaggerEncDecSplitForImageCaptioning(config=config) # init from scratch
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__
    self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__
    self.encoder = TIMMVitSplitEncoder(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__
    logging.info(config)
Unable to print the message and arguments - possible formatting error.
Use the traceback above to help find the error.
--- Logging error ---
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 1025, in emit
    msg = self.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 869, in format
    return fmt.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 608, in format
    record.message = record.getMessage()
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 367, in getMessage
    msg = str(self.msg)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 236, in __repr__
    return str(self.to_json_string())
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 245, in to_json_string
    return json.dumps(self.to_dict(), indent=2, sort_keys=True) + "\n"
  File "/opt/conda/lib/python3.7/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 201, in encode
    chunks = list(chunks)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 431, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "/opt/conda/lib/python3.7/json/encoder.py", line 438, in _iterencode
    o = _default(o)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type BertTokenizer is not JSON serializable
Call stack:
  File "src/qd/pipeline.py", line 1368, in <module>
    locals()[function_name](**kwargs)
  File "src/qd/pipeline.py", line 637, in pipeline_train_eval_multi
    pip.monitor_train()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1224, in monitor_train
    need_wait_models = self.pred_eval_intermediate_models()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1286, in pred_eval_intermediate_models
    pred = self.ensure_predict(model_file=model_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 886, in ensure_predict
    self.predict(model_file, predict_result_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 989, in predict
    model = self.get_model(is_train=False)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model
    model = self.get_raw_model(is_train)
  File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model
    model = TaggerEncDecSplitForImageCaptioning(config=config) # init from scratch
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__
    self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__
    self.encoder = TIMMVitSplitEncoder(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__
    logging.info(config)
Unable to print the message and arguments - possible formatting error.
Use the traceback above to help find the error.
2022-03-17 14:04:42,578.578 2829:modeling_bert.py:529 __init__(): TIMM Split image encoder load from pre-trained: True
2022-03-17 14:04:43,646.646 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-17 14:04:45,196.196 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-17 14:04:45,849.849 2829:modeling_bert.py:2677 __init__(): BertImgModel Image Dimension: 2054
2022-03-17 14:04:45,998.998 2829:modeling_bert.py:2688 __init__(): BertImgModel Image Dimension: 2054
2022-03-17 14:04:46,788.788 2829:tagger_caption_uni_pipeline_expanding.py:1130 get_image_encoder_model(): VIT image encoder loaded from pre-trained weight! Note that this might be replaced by pre-trained checkpoint later!
2022-03-17 14:04:46,789.789 2829:tagger_caption_uni_pipeline_expanding.py:1134 get_image_encoder_model(): Non-Patch Selection Mode.
2022-03-17 14:04:47,921.921 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-17 14:04:48,456.456 2829:checkpoint.py:240 load(): Loading checkpoint from output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0065000.pt
[... 2022-03-17 14:04:57,017 to 14:04:57,031: align_and_update_state_dicts() per-parameter messages for the model_iter_0065000.pt checkpoint are elided. image_encoder.module.cls_token (1, 1, 768), image_encoder.module.head.*, image_encoder.module.patch_embed.proj.*, image_encoder.module.pos_embed (1, 577, 768), module.bert.caption_pooler.dense.*, module.bert.decoder.layer.0-3.*, module.bert.embeddings.*, and module.bert.encoder.blocks.0-5.* were all loaded from the same-named checkpoint keys with matching shapes; the captured log is truncated mid-message at module.bert.encoder.blocks.5.mlp.fc2.weight ...]
3072) 2022-03-17 14:04:57,031.031 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm1.bias loaded from bert.encoder.blocks.5.norm1.bias of shape (768,) 2022-03-17 14:04:57,031.031 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm1.weight loaded from bert.encoder.blocks.5.norm1.weight of shape (768,) 2022-03-17 14:04:57,031.031 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm2.bias loaded from bert.encoder.blocks.5.norm2.bias of shape (768,) 2022-03-17 14:04:57,031.031 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm2.weight loaded from bert.encoder.blocks.5.norm2.weight of shape (768,) 2022-03-17 14:04:57,031.031 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.attn.proj.bias loaded from bert.encoder.blocks.6.attn.proj.bias of shape (768,) 2022-03-17 14:04:57,031.031 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.attn.proj.weight loaded from bert.encoder.blocks.6.attn.proj.weight of shape (768, 768) 2022-03-17 14:04:57,031.031 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.attn.qkv.bias loaded from bert.encoder.blocks.6.attn.qkv.bias of shape (2304,) 2022-03-17 14:04:57,031.031 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.attn.qkv.weight loaded from bert.encoder.blocks.6.attn.qkv.weight of shape (2304, 768) 2022-03-17 14:04:57,031.031 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc1.bias loaded from bert.encoder.blocks.6.mlp.fc1.bias of shape (3072,) 2022-03-17 14:04:57,032.032 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc1.weight loaded from bert.encoder.blocks.6.mlp.fc1.weight of shape (3072, 768) 2022-03-17 14:04:57,032.032 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc2.bias loaded from bert.encoder.blocks.6.mlp.fc2.bias of shape (768,) 2022-03-17 14:04:57,032.032 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc2.weight loaded from bert.encoder.blocks.6.mlp.fc2.weight of shape (768, 3072) 2022-03-17 14:04:57,032.032 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm1.bias loaded from bert.encoder.blocks.6.norm1.bias of shape (768,) 2022-03-17 14:04:57,032.032 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm1.weight loaded from bert.encoder.blocks.6.norm1.weight of shape (768,) 2022-03-17 14:04:57,032.032 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm2.bias loaded from bert.encoder.blocks.6.norm2.bias of shape (768,) 2022-03-17 14:04:57,032.032 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm2.weight loaded from bert.encoder.blocks.6.norm2.weight of shape (768,) 2022-03-17 14:04:57,032.032 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.proj.bias loaded from bert.encoder.blocks.7.attn.proj.bias of shape (768,) 2022-03-17 14:04:57,032.032 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.proj.weight loaded from bert.encoder.blocks.7.attn.proj.weight of shape (768, 768) 2022-03-17 14:04:57,032.032 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.qkv.bias loaded 
from bert.encoder.blocks.7.attn.qkv.bias of shape (2304,) 2022-03-17 14:04:57,032.032 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.qkv.weight loaded from bert.encoder.blocks.7.attn.qkv.weight of shape (2304, 768) 2022-03-17 14:04:57,032.032 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc1.bias loaded from bert.encoder.blocks.7.mlp.fc1.bias of shape (3072,) 2022-03-17 14:04:57,033.033 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc1.weight loaded from bert.encoder.blocks.7.mlp.fc1.weight of shape (3072, 768) 2022-03-17 14:04:57,033.033 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc2.bias loaded from bert.encoder.blocks.7.mlp.fc2.bias of shape (768,) 2022-03-17 14:04:57,033.033 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc2.weight loaded from bert.encoder.blocks.7.mlp.fc2.weight of shape (768, 3072) 2022-03-17 14:04:57,033.033 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm1.bias loaded from bert.encoder.blocks.7.norm1.bias of shape (768,) 2022-03-17 14:04:57,033.033 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm1.weight loaded from bert.encoder.blocks.7.norm1.weight of shape (768,) 2022-03-17 14:04:57,033.033 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm2.bias loaded from bert.encoder.blocks.7.norm2.bias of shape (768,) 2022-03-17 14:04:57,033.033 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm2.weight loaded from bert.encoder.blocks.7.norm2.weight of shape (768,) 2022-03-17 14:04:57,033.033 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.proj.bias loaded from bert.encoder.blocks.8.attn.proj.bias of shape (768,) 2022-03-17 14:04:57,033.033 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.proj.weight loaded from bert.encoder.blocks.8.attn.proj.weight of shape (768, 768) 2022-03-17 14:04:57,033.033 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.qkv.bias loaded from bert.encoder.blocks.8.attn.qkv.bias of shape (2304,) 2022-03-17 14:04:57,033.033 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.qkv.weight loaded from bert.encoder.blocks.8.attn.qkv.weight of shape (2304, 768) 2022-03-17 14:04:57,033.033 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc1.bias loaded from bert.encoder.blocks.8.mlp.fc1.bias of shape (3072,) 2022-03-17 14:04:57,033.033 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc1.weight loaded from bert.encoder.blocks.8.mlp.fc1.weight of shape (3072, 768) 2022-03-17 14:04:57,034.034 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc2.bias loaded from bert.encoder.blocks.8.mlp.fc2.bias of shape (768,) 2022-03-17 14:04:57,034.034 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc2.weight loaded from bert.encoder.blocks.8.mlp.fc2.weight of shape (768, 3072) 2022-03-17 14:04:57,034.034 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm1.bias loaded from bert.encoder.blocks.8.norm1.bias of shape (768,) 2022-03-17 14:04:57,034.034 2829:checkpoint.py:99 
align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm1.weight loaded from bert.encoder.blocks.8.norm1.weight of shape (768,) 2022-03-17 14:04:57,034.034 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm2.bias loaded from bert.encoder.blocks.8.norm2.bias of shape (768,) 2022-03-17 14:04:57,034.034 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm2.weight loaded from bert.encoder.blocks.8.norm2.weight of shape (768,) 2022-03-17 14:04:57,034.034 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.proj.bias loaded from bert.encoder.blocks.9.attn.proj.bias of shape (768,) 2022-03-17 14:04:57,034.034 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.proj.weight loaded from bert.encoder.blocks.9.attn.proj.weight of shape (768, 768) 2022-03-17 14:04:57,034.034 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.qkv.bias loaded from bert.encoder.blocks.9.attn.qkv.bias of shape (2304,) 2022-03-17 14:04:57,034.034 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.qkv.weight loaded from bert.encoder.blocks.9.attn.qkv.weight of shape (2304, 768) 2022-03-17 14:04:57,034.034 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc1.bias loaded from bert.encoder.blocks.9.mlp.fc1.bias of shape (3072,) 2022-03-17 14:04:57,034.034 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc1.weight loaded from bert.encoder.blocks.9.mlp.fc1.weight of shape (3072, 768) 2022-03-17 14:04:57,035.035 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc2.bias loaded from bert.encoder.blocks.9.mlp.fc2.bias of shape (768,) 2022-03-17 14:04:57,035.035 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc2.weight loaded from bert.encoder.blocks.9.mlp.fc2.weight of shape (768, 3072) 2022-03-17 14:04:57,035.035 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm1.bias loaded from bert.encoder.blocks.9.norm1.bias of shape (768,) 2022-03-17 14:04:57,035.035 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm1.weight loaded from bert.encoder.blocks.9.norm1.weight of shape (768,) 2022-03-17 14:04:57,035.035 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm2.bias loaded from bert.encoder.blocks.9.norm2.bias of shape (768,) 2022-03-17 14:04:57,035.035 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm2.weight loaded from bert.encoder.blocks.9.norm2.weight of shape (768,) 2022-03-17 14:04:57,035.035 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.proj.bias loaded from bert.encoder.tag_blocks.0.attn.proj.bias of shape (768,) 2022-03-17 14:04:57,035.035 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.proj.weight loaded from bert.encoder.tag_blocks.0.attn.proj.weight of shape (768, 768) 2022-03-17 14:04:57,035.035 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.qkv.bias loaded from bert.encoder.tag_blocks.0.attn.qkv.bias of shape (2304,) 2022-03-17 14:04:57,035.035 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.qkv.weight loaded from 
bert.encoder.tag_blocks.0.attn.qkv.weight of shape (2304, 768) 2022-03-17 14:04:57,035.035 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc1.bias loaded from bert.encoder.tag_blocks.0.mlp.fc1.bias of shape (3072,) 2022-03-17 14:04:57,035.035 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc1.weight loaded from bert.encoder.tag_blocks.0.mlp.fc1.weight of shape (3072, 768) 2022-03-17 14:04:57,035.035 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc2.bias loaded from bert.encoder.tag_blocks.0.mlp.fc2.bias of shape (768,) 2022-03-17 14:04:57,036.036 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc2.weight loaded from bert.encoder.tag_blocks.0.mlp.fc2.weight of shape (768, 3072) 2022-03-17 14:04:57,036.036 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm1.bias loaded from bert.encoder.tag_blocks.0.norm1.bias of shape (768,) 2022-03-17 14:04:57,036.036 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm1.weight loaded from bert.encoder.tag_blocks.0.norm1.weight of shape (768,) 2022-03-17 14:04:57,036.036 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm2.bias loaded from bert.encoder.tag_blocks.0.norm2.bias of shape (768,) 2022-03-17 14:04:57,036.036 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm2.weight loaded from bert.encoder.tag_blocks.0.norm2.weight of shape (768,) 2022-03-17 14:04:57,036.036 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.proj.bias loaded from bert.encoder.tag_blocks.1.attn.proj.bias of shape (768,) 2022-03-17 14:04:57,036.036 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.proj.weight loaded from bert.encoder.tag_blocks.1.attn.proj.weight of shape (768, 768) 2022-03-17 14:04:57,036.036 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.qkv.bias loaded from bert.encoder.tag_blocks.1.attn.qkv.bias of shape (2304,) 2022-03-17 14:04:57,036.036 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.qkv.weight loaded from bert.encoder.tag_blocks.1.attn.qkv.weight of shape (2304, 768) 2022-03-17 14:04:57,036.036 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc1.bias loaded from bert.encoder.tag_blocks.1.mlp.fc1.bias of shape (3072,) 2022-03-17 14:04:57,036.036 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc1.weight loaded from bert.encoder.tag_blocks.1.mlp.fc1.weight of shape (3072, 768) 2022-03-17 14:04:57,036.036 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc2.bias loaded from bert.encoder.tag_blocks.1.mlp.fc2.bias of shape (768,) 2022-03-17 14:04:57,036.036 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc2.weight loaded from bert.encoder.tag_blocks.1.mlp.fc2.weight of shape (768, 3072) 2022-03-17 14:04:57,037.037 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm1.bias loaded from bert.encoder.tag_blocks.1.norm1.bias of shape (768,) 2022-03-17 14:04:57,037.037 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.encoder.tag_blocks.1.norm1.weight loaded from bert.encoder.tag_blocks.1.norm1.weight of shape (768,) 2022-03-17 14:04:57,037.037 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm2.bias loaded from bert.encoder.tag_blocks.1.norm2.bias of shape (768,) 2022-03-17 14:04:57,037.037 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm2.weight loaded from bert.encoder.tag_blocks.1.norm2.weight of shape (768,) 2022-03-17 14:04:57,037.037 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.proj.bias loaded from bert.encoder.tag_blocks.2.attn.proj.bias of shape (768,) 2022-03-17 14:04:57,037.037 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.proj.weight loaded from bert.encoder.tag_blocks.2.attn.proj.weight of shape (768, 768) 2022-03-17 14:04:57,037.037 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.qkv.bias loaded from bert.encoder.tag_blocks.2.attn.qkv.bias of shape (2304,) 2022-03-17 14:04:57,037.037 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.qkv.weight loaded from bert.encoder.tag_blocks.2.attn.qkv.weight of shape (2304, 768) 2022-03-17 14:04:57,037.037 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc1.bias loaded from bert.encoder.tag_blocks.2.mlp.fc1.bias of shape (3072,) 2022-03-17 14:04:57,037.037 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc1.weight loaded from bert.encoder.tag_blocks.2.mlp.fc1.weight of shape (3072, 768) 2022-03-17 14:04:57,037.037 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc2.bias loaded from bert.encoder.tag_blocks.2.mlp.fc2.bias of shape (768,) 2022-03-17 14:04:57,037.037 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc2.weight loaded from bert.encoder.tag_blocks.2.mlp.fc2.weight of shape (768, 3072) 2022-03-17 14:04:57,038.038 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm1.bias loaded from bert.encoder.tag_blocks.2.norm1.bias of shape (768,) 2022-03-17 14:04:57,038.038 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm1.weight loaded from bert.encoder.tag_blocks.2.norm1.weight of shape (768,) 2022-03-17 14:04:57,038.038 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm2.bias loaded from bert.encoder.tag_blocks.2.norm2.bias of shape (768,) 2022-03-17 14:04:57,038.038 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm2.weight loaded from bert.encoder.tag_blocks.2.norm2.weight of shape (768,) 2022-03-17 14:04:57,038.038 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.proj.bias loaded from bert.encoder.tag_blocks.3.attn.proj.bias of shape (768,) 2022-03-17 14:04:57,038.038 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.proj.weight loaded from bert.encoder.tag_blocks.3.attn.proj.weight of shape (768, 768) 2022-03-17 14:04:57,038.038 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.qkv.bias loaded from bert.encoder.tag_blocks.3.attn.qkv.bias of shape (2304,) 2022-03-17 14:04:57,038.038 2829:checkpoint.py:99 
align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.qkv.weight loaded from bert.encoder.tag_blocks.3.attn.qkv.weight of shape (2304, 768) 2022-03-17 14:04:57,038.038 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc1.bias loaded from bert.encoder.tag_blocks.3.mlp.fc1.bias of shape (3072,) 2022-03-17 14:04:57,038.038 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc1.weight loaded from bert.encoder.tag_blocks.3.mlp.fc1.weight of shape (3072, 768) 2022-03-17 14:04:57,038.038 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc2.bias loaded from bert.encoder.tag_blocks.3.mlp.fc2.bias of shape (768,) 2022-03-17 14:04:57,038.038 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc2.weight loaded from bert.encoder.tag_blocks.3.mlp.fc2.weight of shape (768, 3072) 2022-03-17 14:04:57,038.038 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm1.bias loaded from bert.encoder.tag_blocks.3.norm1.bias of shape (768,) 2022-03-17 14:04:57,038.038 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm1.weight loaded from bert.encoder.tag_blocks.3.norm1.weight of shape (768,) 2022-03-17 14:04:57,039.039 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm2.bias loaded from bert.encoder.tag_blocks.3.norm2.bias of shape (768,) 2022-03-17 14:04:57,039.039 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm2.weight loaded from bert.encoder.tag_blocks.3.norm2.weight of shape (768,) 2022-03-17 14:04:57,039.039 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.LayerNorm.bias loaded from bert.extra_embeddings.LayerNorm.bias of shape (768,) 2022-03-17 14:04:57,039.039 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.LayerNorm.weight loaded from bert.extra_embeddings.LayerNorm.weight of shape (768,) 2022-03-17 14:04:57,039.039 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.position_embeddings.weight loaded from bert.extra_embeddings.position_embeddings.weight of shape (512, 768) 2022-03-17 14:04:57,039.039 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.token_type_embeddings.weight loaded from bert.extra_embeddings.token_type_embeddings.weight of shape (2, 768) 2022-03-17 14:04:57,039.039 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.word_embeddings.weight loaded from bert.extra_embeddings.word_embeddings.weight of shape (30522, 768) 2022-03-17 14:04:57,039.039 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.pooler.dense.bias loaded from bert.pooler.dense.bias of shape (768,) 2022-03-17 14:04:57,039.039 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.pooler.dense.weight loaded from bert.pooler.dense.weight of shape (768, 768) 2022-03-17 14:04:57,039.039 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.bias loaded from bert.tag_logit.predictions.bias of shape (30522,) 2022-03-17 14:04:57,039.039 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.decoder.weight loaded from bert.tag_logit.predictions.decoder.weight of shape (30522, 768) 2022-03-17 14:04:57,039.039 2829:checkpoint.py:99 
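For reference, the entries above show a name-alignment step: the running model is wrapped (DataParallel/DDP style), so its keys carry a "module." prefix that the saved checkpoint's keys lack, and each parameter is matched by un-prefixed name and shape before loading. A minimal sketch of that behavior, assuming a plain PyTorch module; the function and variable names here are illustrative, not the project's actual checkpoint.py code:

    def align_by_prefix(model, ckpt_state, prefix="module."):
        # Match each model key against its un-prefixed counterpart in the
        # checkpoint, require identical shapes, and report each hit in the
        # same style as the log above.
        model_state = model.state_dict()
        aligned = {}
        for name, param in model_state.items():
            src = name[len(prefix):] if name.startswith(prefix) else name
            if src in ckpt_state and ckpt_state[src].shape == param.shape:
                aligned[name] = ckpt_state[src]
                print(f"{name} loaded from {src} of shape {tuple(param.shape)}")
        model.load_state_dict(aligned, strict=False)  # leave unmatched params as-is
        print(f"target model param = {len(model_state)}; "
              f"name matched = {len(aligned)}; loaded = {len(aligned)}")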
2022-03-17 14:04:57,044.044 2829:torch_common.py:1069 load_model_state_ignore_mismatch(): unique keys in init dict = ['module.cls.predictions.decoder.weight']; total = 1
2022-03-17 14:04:57,205.205 2829:torch_common.py:1074 load_model_state_ignore_mismatch(): unique key (not initialized) in current model = ['module.cls.predictions.decoder.weight']
2022-03-17 14:04:57,525.525 2829:qd_common.py:3452 print_frame_info(): func name = __init__; self = ; data = TaxCocoCaption; split = test; add_key = False; backend = cv; hold_buffer = 0; save_original = False
2022-03-17 14:04:57,553.553 2829:uni_pipeline.py:509 get_data_loader(): sampler =
2022-03-17 14:04:57,553.553 2829:uni_pipeline.py:1000 predict(): writing output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0065000.pt.TaxCocoCaption.test.crop384.predict.tsv_0_32.tsv
2022-03-17 14:04:57,627.627 2829:uni_pipeline.py:907 predict_iter(): DatasetPlusTransform(dataset=, transform=Compose(
    Compose(
        ImageTransform2Dict(image_transform=Compose(
            ToPILImage()
            Resize(size=384, interpolation=PIL.Image.BICUBIC)
            CenterCrop(size=(384, 384))
            ToTensor()
            Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
        ))
    )
    LoadLabel(data=TaxCocoCaption, split=test, version=vinvl)
    TransCaptionTensorizer(tensorizer=, pad_to_max=True, pad_image_to_max=True)
))
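The Compose repr above fully specifies the test-time (crop384) image preprocessing, so it can be reproduced with stock torchvision. A sketch of that pipeline only, not of the project's ImageTransform2Dict wrapper: the decoded image is resized so its short side is 384 (bicubic), center-cropped to 384x384, and normalized with mean = std = 0.5, which maps pixel values from [0, 1] into [-1, 1]:

    from PIL import Image
    from torchvision import transforms

    image_transform = transforms.Compose([
        transforms.ToPILImage(),                               # array from the cv backend -> PIL
        transforms.Resize(384, interpolation=Image.BICUBIC),   # short side -> 384
        transforms.CenterCrop((384, 384)),
        transforms.ToTensor(),                                 # HWC uint8 -> CHW float in [0, 1]
        transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),
    ])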
uni_pipeline.py:908: 0%| | 0/10 [00:00
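The prediction file name ends in predict.tsv_0_32.tsv, which appears to encode a shard index and worker count (this run is presumably rank 0 of 32 distributed workers, each handling its 10 batches before the shards are merged). A hypothetical helper illustrating that naming scheme; the function name and the merge assumption are guesses, not the project's API:

    def shard_output_name(predict_file, rank, world_size):
        # e.g. shard_output_name("....predict.tsv", 0, 32)
        #      -> "....predict.tsv_0_32.tsv"
        return f"{predict_file}_{rank}_{world_size}.tsv"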